Algorithms and Software for Text Classification

Saturday, October 22, 2022

Text classification is a well-developed area that has been successfully applied in many applications. However, we have found that tools for easily and conveniently solving users’ problems are still somewhat lacking. Recently, we developed a tool, LibMultiLabel, for both binary/multi-class and multi-label text classification. It supports an end-to-end workflow from raw texts to final evaluation and analysis, and it includes common learning techniques as well as easy hyper-parameter selection. Unfortunately, because practical applications involve many considerations (e.g., the selection of evaluation criteria and strategies for data with or without many labels), we do not yet have a good recipe for guiding users to effectively solve all their problems. In our ongoing efforts toward this goal, we have found that the inappropriate use of machine learning methods is now a big concern. We share some interesting stories and discuss the importance of helping users apply machine learning techniques appropriately.

