View Submission - EcoSta2024
A0523
Title: On a SAVE and DR for large scale dataset
Authors: Chaehyun Ryu - Ewha Womans University (Korea, South) [presenting]
Abstract: Directional regression (DR) is an effective dimension reduction approach for capturing inherent characteristics in regression problems. The ideas of sliced average variance estimation (SAVE) and directional regression are extended to handle massive datasets. In particular, a "divide and conquer" strategy is adopted: the dataset is broken down into manageable chunks, and the chunk-level results are subsequently merged based on the proximity between dimension reduction subspaces. The capability of capturing distance is further harnessed to enhance computational efficiency and optimize memory usage. The competitiveness of the approach is demonstrated through a comprehensive numerical study and an application to a real-world dataset. In both simulation and application, the R packages "foreach" and "bigmemory" are used to optimize execution speed and manage memory when dealing with a massive dataset. The proposed methods, BIG-SAVE and BIG-DR, are compared with the existing method, BIG-SIR, with a focus on computational speed and accuracy. The application of the methods to real data demonstrates their practical applicability and versatility.
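The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract uses R with "foreach" and "bigmemory", whereas this sketch uses Python with NumPy and holds all data in memory. The function names `save_directions` and `merge_subspaces`, the slice count, the simulated model, and the merging rule (leading eigenvectors of the size-weighted average of chunk projection matrices, in the spirit of BIG-SIR) are all assumptions made for illustration; the abstract does not specify the exact proximity criterion.

```python
import numpy as np

def save_directions(X, y, d=1, n_slices=5):
    """Estimate d SAVE directions from one chunk (hypothetical helper)."""
    n, p = X.shape
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Inverse square root of the covariance, used to standardize X.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ inv_sqrt
    # Slice on the order of y and accumulate sum_h p_h (I - Var(Z | slice h))^2.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        A = np.eye(p) - np.cov(Z[idx], rowvar=False)
        M += (len(idx) / n) * (A @ A)
    # eigh returns ascending eigenvalues, so reverse to take the top d.
    _, vecs = np.linalg.eigh(M)
    B = inv_sqrt @ vecs[:, ::-1][:, :d]
    # Orthonormalize so each chunk contributes a proper projection matrix.
    Q, _ = np.linalg.qr(B)
    return Q

def merge_subspaces(bases, weights):
    """Assumed merging rule: top eigenvectors of the weighted sum of projections."""
    d = bases[0].shape[1]
    S = sum(w * (Q @ Q.T) for Q, w in zip(bases, weights))
    _, vecs = np.linalg.eigh(S)
    return vecs[:, ::-1][:, :d]

# Simulated data (an assumption): y depends on X only through its first coordinate.
rng = np.random.default_rng(0)
n, p, n_chunks = 20000, 5, 10
X = rng.standard_normal((n, p))
y = X[:, 0] ** 2 + 0.1 * rng.standard_normal(n)

bases, weights = [], []
for rows in np.array_split(np.arange(n), n_chunks):
    bases.append(save_directions(X[rows], y[rows]))
    weights.append(len(rows) / n)

b_hat = merge_subspaces(bases, weights)
# Squared cosine between the merged direction and the true direction e_1.
sq_cos = float(b_hat[0, 0]) ** 2
print(f"squared cosine with true direction: {sq_cos:.3f}")
```

Each chunk is processed independently, so the loop over chunks is the part that a parallel backend could distribute. Only the small p x d bases are kept for the merging step.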