A1074
Title: Feature selection for high-dimensional, high-cardinality data in NFT valuation: A case study of Azuki
Authors: Geun-Cheol Lee - Konkuk University (Korea, South) [presenting]
Hoon-Young Koo - Chungnam National University (Korea, South)
Heejung Lee - Hanyang University (Korea, South)
Abstract: Non-fungible tokens (NFTs), particularly profile picture (PFP) collections, are characterized by high-dimensional and high-cardinality feature spaces. Datasets of PFP NFTs typically consist of around ten attributes, each with dozens of unique values, yielding potentially hundreds of features for valuation models. A feature selection procedure is proposed that incorporates both first- and second-order features into a hedonic regression model. Including squared and interaction terms as second-order features can increase the number of independent variables to tens of thousands. Using the features selected by the proposed procedure, a final regression model is developed for PFP NFT valuation. Computational experiments on real transaction and attribute data from the Azuki PFP NFT collection are conducted to validate the performance of the proposed model. In comparative experiments, existing machine learning methods are trained on a subset of the data and their explanatory power is evaluated on validation data. The findings demonstrate the efficacy of tailored feature selection strategies in managing high-dimensional, high-cardinality data and provide insights for predictive valuation modeling in NFT markets.
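To make the dimensionality argument concrete, the short Python sketch below counts candidate regressors under stated assumptions. It assumes one-hot (dummy) encoding of each attribute value, and it uses illustrative figures of 10 attributes with 30 values each; these numbers and the function name feature_counts are hypothetical and are not taken from the Azuki data or from the authors' procedure.

```python
from math import comb


def feature_counts(n_attributes: int, values_per_attribute: int) -> dict:
    """Count candidate regressors for a one-hot encoded attribute table.

    Assumes every attribute has the same number of distinct values,
    which is a simplification made only for this illustration.
    """
    # First-order features: one dummy variable per attribute value.
    first_order = n_attributes * values_per_attribute

    # Squared terms: one per first-order feature. For 0/1 dummies a squared
    # term equals the original variable; it is counted here only to mirror
    # the abstract's description of second-order features.
    squared = first_order

    # Pairwise interactions between any two distinct first-order features.
    all_pairs = comb(first_order, 2)

    # Interactions between values of two different attributes only.
    # Two values of the same attribute never co-occur on one token, so
    # same-attribute products are identically zero under one-hot encoding.
    cross_attribute_pairs = comb(n_attributes, 2) * values_per_attribute ** 2

    return {
        "first_order": first_order,
        "squared": squared,
        "all_pairs": all_pairs,
        "cross_attribute_pairs": cross_attribute_pairs,
        "second_order_total": squared + all_pairs,
    }


counts = feature_counts(n_attributes=10, values_per_attribute=30)
for name, value in counts.items():
    print(f"{name}: {value}")
```

Under these assumed figures, a few hundred first-order features expand to tens of thousands of second-order candidates, which is the scale at which a dedicated selection step becomes necessary before fitting a regression.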