Introducing Scikit-P4: a simple library for Python

I’ve just released Scikit-P4, a small Python library that computes the P4 metric for binary, multiclass, and multilabel classification – using an API that mirrors scikit-learn’s metrics (think f1_score, but for P4).

Continue reading “Introducing Scikit-P4: a simple library for Python” →

Meet P4 metric – new way to evaluate binary classifiers

Introduction

Binary classifiers accompany us every day. Medical tests that detect a disease give a positive or negative answer, spam filters say spam or not spam, and smartphones that authenticate us by a face scan or fingerprint make a known/unknown decision. The question of how to evaluate such a classifier does not seem particularly complicated: just choose the one that predicts the most cases correctly. As many of us have already realized, properly evaluating a binary classifier requires somewhat more sophisticated means. But we’ll talk about that in a moment.
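To see why "predict the most cases correctly" can mislead, here is a minimal sketch with made-up numbers (the screening scenario and all counts are hypothetical, not taken from the article). It compares the accuracy of a classifier that always answers "negative" with one that actually detects most positive cases.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical screening test: 1,000 people, 10 of whom are actually ill.
# A "classifier" that always answers "negative" misses every ill person.
always_negative = accuracy(tp=0, tn=990, fp=0, fn=10)

# A test that finds 8 of the 10 ill people, at the cost of 30 false alarms,
# is correct less often, yet it is clearly the more useful of the two.
useful_test = accuracy(tp=8, tn=960, fp=30, fn=2)

print(f"always negative: {always_negative:.3f}")
print(f"useful test:     {useful_test:.3f}")
```

On such an imbalanced population the useless classifier wins on accuracy, which is why metrics beyond accuracy are needed.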

Continue reading “Meet P4 metric – new way to evaluate binary classifiers” →

Extending F1 metric, probabilistic approach

Abstract

This article explores an extension of the well-known F₁ score used for assessing the performance of binary classifiers. We propose a new metric based on the probabilistic interpretation of precision, recall, specificity, and negative predictive value. We describe its properties and compare it to common metrics. Then we demonstrate its behavior in edge cases of the confusion matrix. Finally, the properties of the metric are tested on a binary classifier trained on a real dataset.

Keywords: machine learning, binary classifier, F₁, MCC, precision, recall
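As a rough illustration of the metric, the sketch below assumes the commonly cited definition of P4 as the harmonic mean of precision, recall, specificity, and negative predictive value; the excerpt above does not spell out the formula, so treat this as an assumption. The function names are hypothetical and this is not the Scikit-P4 API.

```python
def p4_from_rates(tp, tn, fp, fn):
    """P4 as the harmonic mean of four rates (assumed definition).

    Requires all four denominators and all four rates to be non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    return 4 / (1 / precision + 1 / recall + 1 / specificity + 1 / npv)


def p4_closed_form(tp, tn, fp, fn):
    """Algebraically equivalent form written directly in confusion-matrix counts."""
    return 4 * tp * tn / (4 * tp * tn + (tp + tn) * (fp + fn))


# Hypothetical confusion matrix, for illustration only.
counts = dict(tp=8, tn=960, fp=30, fn=2)
print(p4_from_rates(**counts), p4_closed_form(**counts))
```

The closed form also covers the edge case TP = 0 (or TN = 0), where the rate-based version would divide by zero: the metric is then 0.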

Continue reading “Extending F1 metric, probabilistic approach” →

Binary classifier metrics

Have you ever wanted to develop a better intuition for measuring the performance of a binary classifier? Precision, recall, accuracy, specificity, F1… Now you have all these metrics at your fingertips in the Performance Metrics Playground. You can control the population parameters (the number of positive and negative samples) as well as the simulated classifier parameters (the number of true positives and true negatives).
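The same four inputs are enough to compute the listed metrics by hand. The following is a minimal sketch, not the Playground's own code; the function name and the example numbers are hypothetical, and it assumes both classes are non-empty.

```python
def classifier_metrics(positives, negatives, true_positives, true_negatives):
    """Derive common metrics from population size and correct predictions."""
    tp, tn = true_positives, true_negatives
    fn = positives - tp   # positive samples the classifier missed
    fp = negatives - tn   # negative samples flagged as positive
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "precision": tp / (tp + fp) if tp + fp else nan,
        "recall": tp / positives,
        "specificity": tn / negatives,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan,
    }


# Hypothetical population: 100 positive and 900 negative samples.
result = classifier_metrics(positives=100, negatives=900,
                            true_positives=80, true_negatives=810)
for name, value in result.items():
    print(f"{name:12s} {value:.3f}")
```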

Continue reading “Binary classifier metrics” →

Image Recognition and Linear Regression

In the following article, we will look at image recognition using linear regression. We realize that this idea may seem quite unusual. However, we will show, using a simple example, that for a certain class of images, and under quite strictly defined conditions, linear regression can achieve surprisingly good results.

Continue reading “Image Recognition and Linear Regression” →

RPN Calculator

Reverse Polish Notation (RPN) is a way of writing mathematical expressions that allows calculations to be performed without brackets, thanks to the use of a stack. It was popularized by Hewlett-Packard, which has used it successfully in its calculators for many years.

The Artinu application is a simple RPN calculator, written in JavaScript, available online.
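The stack mechanism can be sketched in a few lines. The example below is written in Python for illustration only and is not the Artinu source code; it assumes space-separated tokens and supports just the four basic operators.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_rpn(expression):
    """Evaluate a space-separated RPN expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            if len(stack) < 2:
                raise ValueError(f"not enough operands for '{token}'")
            right = stack.pop()   # the most recent value is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))   # numbers are simply pushed
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(evaluate_rpn("3 4 + 2 *"))   # (3 + 4) * 2 → 14.0
```

Because every operator consumes the two topmost values, the order of operations is fully determined by token order and no brackets are needed.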

Continue reading “RPN Calculator” →

Biased and unbiased estimators

When we want to know the standard deviation of a large population, we usually take a sample and then calculate the value of an estimator. However, it is not always clear which estimator we should use. People sometimes argue about whether the biased or the unbiased standard deviation estimator is better. Below we explore this question and present the results of a numerical simulation.
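A simulation of this kind might look like the sketch below. It is not the article's own experiment: the normal population, the sample size, the number of trials, and the seed are all assumptions chosen for illustration. It averages both estimators over many small samples drawn from a population with a known standard deviation.

```python
import math
import random


def compare_estimators(sample_size=5, trials=20000, sigma=1.0, seed=42):
    """Average the n-divisor and (n - 1)-divisor estimators over many samples."""
    rng = random.Random(seed)
    sum_biased = 0.0
    sum_corrected = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        squares = sum((x - mean) ** 2 for x in sample)
        sum_biased += math.sqrt(squares / sample_size)           # divides by n
        sum_corrected += math.sqrt(squares / (sample_size - 1))  # divides by n - 1
    return sum_biased / trials, sum_corrected / trials


mean_biased, mean_corrected = compare_estimators()
print(f"true sigma:          1.000")
print(f"divisor n:           {mean_biased:.3f}")
print(f"divisor n - 1:       {mean_corrected:.3f}")
```

Note that the n - 1 divisor makes the variance estimator unbiased; its square root still underestimates the standard deviation slightly for small samples, only less so than the n-divisor version.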

Continue reading “Biased and unbiased estimators” →

Czy sztuczna inteligencja zna się na mechanice kwantowej? (Does artificial intelligence understand quantum mechanics?)

English abstract:

We explore the possibility of using the K-Means algorithm to clean scans of handwritten notes. We use the scikit-learn implementation of the algorithm. The original image is decomposed into three clusters in RGB space, and the cleaned picture is obtained by removing two of the three clusters from the original. References:

Original image: picture-before-cleaning.jpg
Image after cleaning: picture-after-cleaning.jpg
Source code: decompose_image.py
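The idea from the abstract can be sketched as follows. This is not the contents of decompose_image.py: the function name, the synthetic test image, and the rule "keep the darkest cluster" are assumptions made for illustration. It requires NumPy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans


def clean_scan(image, n_clusters=3, random_state=0):
    """Cluster pixels in RGB space and keep only the darkest cluster."""
    height, width, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(pixels)
    # Assumption: the ink is the cluster whose centre is the darkest.
    ink = np.argmin(model.cluster_centers_.sum(axis=1))
    cleaned = np.full_like(pixels, 255.0)          # start from a white page
    cleaned[labels == ink] = pixels[labels == ink]  # copy back the ink pixels
    return cleaned.reshape(height, width, 3).astype(np.uint8)


# Synthetic "scan": white paper, a pale blue grid, and a block of dark ink.
page = np.full((20, 20, 3), 255, dtype=np.uint8)
page[::5, :] = (170, 200, 240)
page[:, ::5] = (170, 200, 240)
page[8:12, 3:17] = (20, 20, 60)

cleaned = clean_scan(page)
```

On a real scan the three clusters would roughly correspond to paper, grid, and ink, and which clusters to discard has to be decided by inspecting the cluster centres.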

Some time ago, while tidying up a closet, I came across my old odds and ends: notes from quantum mechanics lectures, which I diligently kept as a student in the 1990s. Once I had enjoyed the memories, I began to wonder whether their appearance could be improved a little by cleaning them of unnecessary elements. Every page has a pale blue grid, and ink from the other side of the sheet shows through. The holes for filing the pages in a binder are visible as well.

Continue reading “Czy sztuczna inteligencja zna się na mechanice kwantowej?” →

The limits of central limit theorem – part 2

In the previous part we looked at the distribution of sample means for three distributions: Uniform, Cauchy, and Petersburg. The Cauchy and Petersburg distributions do not satisfy the assumptions of the Central Limit Theorem, since they do not have a finite variance (and the “Petersburg” distribution also has an infinite expected value). Now we will look at the numerical results for the standard deviation of sample means. As in the previous part, we use the Uniform distribution only as a reference, since it satisfies the CLT, and we use the same pseudo-random number generator (Mersenne Twister).
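A minimal sketch of such an experiment is shown below. It is not the article's code: the sample sizes, repetition count, and seed are assumptions. Python's random module is itself based on the Mersenne Twister. For the Uniform distribution the standard deviation of sample means should follow σ/√n, while for the Cauchy distribution it does not settle down as n grows.

```python
import math
import random
import statistics


def draw_uniform(rng):
    """One draw from Uniform(0, 1)."""
    return rng.random()


def draw_cauchy(rng):
    """One draw from the standard Cauchy distribution (inverse CDF method)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def std_of_sample_means(draw, sample_size, repetitions, rng):
    """Standard deviation of the means of many independent samples."""
    means = [
        sum(draw(rng) for _ in range(sample_size)) / sample_size
        for _ in range(repetitions)
    ]
    return statistics.stdev(means)


rng = random.Random(2024)  # Mersenne Twister under the hood
for n in (10, 100, 1000):
    uniform_std = std_of_sample_means(draw_uniform, n, 500, rng)
    cauchy_std = std_of_sample_means(draw_cauchy, n, 500, rng)
    expected = 1 / math.sqrt(12 * n)   # sigma / sqrt(n) for Uniform(0, 1)
    print(f"n={n:5d}  uniform={uniform_std:.4f} (theory {expected:.4f})  "
          f"cauchy={cauchy_std:.2f}")
```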

Continue reading “The limits of central limit theorem – part 2” →

The limits of central limit theorem

The power of the Central Limit Theorem is widely known. In this post we explore the areas outside its scope, where the CLT does not work. We present the results of numerical simulations for three distributions: Uniform, Cauchy, and a certain “naughty” distribution, later called the “Petersburg distribution”.
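The three distributions can be sampled with a few lines of code. The sketch below assumes that the "Petersburg distribution" is the payoff of the St. Petersburg game (2^k with probability 2^-k), which the excerpt does not define explicitly; the function names, sample sizes, and seed are likewise hypothetical.

```python
import math
import random


def draw_uniform(rng):
    """One draw from Uniform(0, 1)."""
    return rng.random()


def draw_cauchy(rng):
    """One draw from the standard Cauchy distribution (inverse CDF method)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def draw_petersburg(rng):
    """Assumed definition: payoff 2**k, where k counts fair coin tosses
    up to and including the first heads, so P(2**k) = 2**-k."""
    k = 1
    while rng.random() < 0.5:   # tails: keep tossing
        k += 1
    return 2 ** k


def sample_mean(draw, sample_size, rng):
    """Mean of sample_size independent draws."""
    return sum(draw(rng) for _ in range(sample_size)) / sample_size


rng = random.Random(12345)
for n in (10, 1000, 100000):
    print(f"n={n:6d}",
          f"uniform={sample_mean(draw_uniform, n, rng):.4f}",
          f"cauchy={sample_mean(draw_cauchy, n, rng):.4f}",
          f"petersburg={sample_mean(draw_petersburg, n, rng):.2f}")
```

Only the Uniform sample mean is expected to converge to a fixed value (0.5) as n grows; the other two keep being disturbed by rare, very large draws.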

Continue reading “The limits of central limit theorem” →

Orange Attractor

Fun, Math and Programming
