From Dataset to Features: A Python-Based Evolutionary Approach

Duration: 60 minutes

Abstract

Multilabel classification is a machine learning task in which each instance is assigned a set of labels rather than a single one. It has gained widespread use in various applications in recent years. Preprocessing, and feature selection in particular, is an important step in any machine learning or data mining task: by eliminating highly correlated, irrelevant, and noisy features, it improves an algorithm's performance and reduces computation time. A new algorithm called Black Hole, inspired by the phenomenon of black holes, has recently been developed to tackle multilabel classification problems. In this talk, we present a modified version of the Black Hole algorithm that combines it with two genetic algorithm operators: crossover and mutation. This combination has the potential to solve multilabel classification problems across a range of domains.

Poster
PyData: Deep Learning, NLP, CV

Description

Multilabel classification is a machine learning task where each instance in a dataset can be assigned multiple labels at once. This is in contrast to traditional single-label classification, where each instance is assigned exactly one label. Multilabel classification has gained popularity in recent years due to its expanding use in a variety of applications across domains.
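To make the difference from single-label classification concrete, here is a minimal sketch of how multilabel targets are commonly encoded as a binary indicator matrix, with one column per label. The toy label sets and the helper name `to_indicator_matrix` are assumptions made for illustration; they are not taken from the talk.

```python
def to_indicator_matrix(label_sets):
    """Encode a list of label sets as rows of 0/1 flags, one column per label."""
    # Columns follow the sorted order of all labels seen in the data.
    classes = sorted(set().union(*label_sets))
    matrix = [[1 if c in labels else 0 for c in classes] for labels in label_sets]
    return classes, matrix


# Hypothetical toy data: each instance may carry one or several labels.
label_sets = [
    {"sports"},
    {"politics", "economy"},
    {"sports", "economy"},
]

classes, Y = to_indicator_matrix(label_sets)
print(classes)  # ['economy', 'politics', 'sports']
for row in Y:
    print(row)
```

Each row of `Y` can contain several ones, which is exactly what a single-label target cannot express.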

One of the key challenges in multilabel classification is the high dimensionality of the data, which can make it difficult for machine learning algorithms to learn effectively. This is where feature selection comes in. Feature selection is the process of identifying and selecting a subset of relevant and non-redundant features from a larger set of features. It is a critical preprocessing step that can improve the performance and efficiency of machine learning algorithms.
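As a rough illustration of what selecting relevant features can look like, the sketch below scores each discrete feature by its mutual information with every label column, averages the scores, and keeps the top k. This is one simple filter-style approach shown under stated assumptions, not the method presented in the talk; the toy data and the function names `mutual_information` and `select_top_k` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x, y) * log2(p(x, y) / (p(x) * p(y))), summed over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def select_top_k(X, Y, k):
    """Rank features by mean mutual information with the labels; keep the best k."""
    feature_cols = list(zip(*X))
    label_cols = list(zip(*Y))
    scores = [
        sum(mutual_information(f, l) for l in label_cols) / len(label_cols)
        for f in feature_cols
    ]
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k]), scores


# Hypothetical toy data: feature 0 mirrors label 0, feature 2 mirrors label 1,
# and feature 1 is constant, so it carries no information about either label.
X = [[0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
Y = [[0, 0], [1, 0], [0, 1], [1, 1]]

selected, scores = select_top_k(X, Y, k=2)
print(selected)  # [0, 2]
print(scores)
```

A filter like this only measures relevance. Detecting redundancy between features, which the paragraph above also calls for, would need an additional feature-to-feature comparison.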

One of the recent developments in feature selection is the Black Hole algorithm, which is inspired by the phenomenon of black holes in space. The Black Hole algorithm is a metaheuristic that iteratively removes the least relevant features from a dataset, based on a relevance measure such as mutual information. In this talk, we present a modified version of the standalone Black Hole algorithm that incorporates two genetic algorithm operators, crossover and mutation, to improve its performance on multilabel classification problems. The hybridization of Black Hole and genetic algorithms has been shown to be effective in solving multilabel classification problems in different domains.
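The talk description does not spell out how the operators are combined, so the following is only a sketch of one plausible hybrid, assuming a population-based binary formulation: each candidate feature subset (a "star") is a bit mask, the best star acts as the black hole, stars are pulled toward it by uniform crossover and then perturbed by bit-flip mutation, and stars that fall inside an event-horizon radius are replaced by random ones. The relevance scores, the size penalty, the Hamming-distance event horizon, and all function names are assumptions made for illustration.

```python
import random

# Hypothetical per-feature relevance scores (e.g. from mutual information).
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.0]
PENALTY = 0.3  # assumed cost per selected feature, to reward small subsets


def fitness(mask):
    """Toy objective: relevance of the selected features minus a size penalty."""
    return sum(r - PENALTY for r, bit in zip(RELEVANCE, mask) if bit)


def random_star(n, rng):
    return [rng.randint(0, 1) for _ in range(n)]


def crossover(star, black_hole, rng):
    """Uniform crossover: each bit comes from the star or from the black hole."""
    return [b if rng.random() < 0.5 else s for s, b in zip(star, black_hole)]


def mutate(star, rate, rng):
    """Bit-flip mutation with per-bit probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in star]


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def black_hole_ga(n_features, n_stars=10, n_iter=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    stars = [random_star(n_features, rng) for _ in range(n_stars)]
    black_hole = max(stars, key=fitness)[:]
    history = [fitness(black_hole)]
    for _ in range(n_iter):
        # Pull every star toward the black hole (crossover), then perturb it.
        stars = [mutate(crossover(s, black_hole, rng), mutation_rate, rng)
                 for s in stars]
        # A star that beats the black hole becomes the new black hole.
        challenger = max(stars, key=fitness)
        if fitness(challenger) > fitness(black_hole):
            black_hole = challenger[:]
        # Event horizon (assumed Hamming-distance variant): stars too close
        # to the black hole are absorbed and replaced by fresh random stars.
        total = sum(abs(fitness(s)) for s in stars) or 1.0
        radius = abs(fitness(black_hole)) / total * n_features
        stars = [random_star(n_features, rng)
                 if hamming(s, black_hole) < radius else s
                 for s in stars]
        history.append(fitness(black_hole))
    return black_hole, history


best_mask, history = black_hole_ga(len(RELEVANCE), seed=42)
print(best_mask, round(history[-1], 3))
```

Because the black hole is only replaced by a strictly better star, the best fitness never decreases from one iteration to the next. In a real multilabel setting the toy `fitness` would be swapped for a measure such as the validation score of a classifier trained on the selected features.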


The speakers

Neeraj Pandey

Neeraj is a polyglot. Over the years, he has worked on a variety of full-stack software and data science applications, as well as computational arts and quantitative finance projects, and he enjoys the challenge of creating new tools and applications.