Note: The entire code can be found (and loved) here
I’m a musician, and a data scientist.
I spend my days writing code and studying statistical theorems. Then (when it gets dark outside) I like to write music.
But is it possible to write music while coding?
The answer is yes, and I’m about to show you how.
Note: If you are interested in the entire process, I want to marry you. If this is not the case, you can skip to points 4, 5 and 6 to see the algorithm and its results.
Note: This article is part of a bigger study. You can find it here and love it till death do you part.
The majestic ball of gas that is the Sun has crucial effects on our lives. We know and live this life thanks to this star, and it is close enough to allow a deep (and magnetic) study of it too.
In particular, solar flares are sudden flashes that occur on the Sun. You may think that in your life you have more serious stuff to think about, and you are probably right. …
Note: This report is part of a bigger project about climate change that can be seen (and hopefully loved) in this GitHub repository.
Some time ago, when I heard on television about the last Australian bushfire season, it was really terrible to hear. I was on another project at that time, but I put this on my to-do list, promising myself that I would work on it to get an unbiased and data-driven opinion about climate change.
Let’s get started.
0. The libraries
Here’s the collection of the libraries that have been used. Additional libraries may be required during the story, but all the libraries can be found in the notebooks that are reported. …
Note: This article is a part of a bigger project that can be seen and hopefully loved in this GitHub repository
Let’s pretend for a second that Machine Learning models are real human beings: none of them is perfect (besides you, of course).
Some models could be too anxious, some too jealous, some too arrogant. The real magic happens when you fall in love with someone who is able to see your weak points and help you improve them, while emphasising your good sides. This is the exact idea of Ensemble Learning.
In fact, it is based on the idea that a wide collection of models is able to perform better than each model that is taken in isolation. …
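The idea that a collection of models can beat each single model can be sketched with a small simulation. This is only a toy illustration under a strong assumption that is not in the text: every model is right with the same probability and the models make their mistakes independently of each other. The function name `simulate_votes` and all the numbers are hypothetical.

```python
import random


def simulate_votes(n_models, accuracy, n_samples, seed=0):
    """Compare one model against a majority vote of n_models models.

    Assumption: each model is correct with probability `accuracy`,
    independently of the others (real models are usually correlated).
    """
    rng = random.Random(seed)
    single_correct = 0
    ensemble_correct = 0
    for _ in range(n_samples):
        # True means "this model classified the sample correctly".
        votes = [rng.random() < accuracy for _ in range(n_models)]
        single_correct += votes[0]
        # The ensemble is right when more than half of the models are right.
        ensemble_correct += sum(votes) > n_models / 2
    return single_correct / n_samples, ensemble_correct / n_samples


single_acc, ensemble_acc = simulate_votes(n_models=11, accuracy=0.6, n_samples=10_000)
print(f"single model: {single_acc:.3f}, majority vote: {ensemble_acc:.3f}")
```

With correlated models the gain shrinks, which is why ensembles usually try to combine models that fail in different ways.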
Every data scientist, especially those who find themselves working with Big Data, knows the importance of dimensionality reduction. If you have a dataset that has a large number of columns and you have a Machine Learning task to complete:
So it is important to know and understand some dimensionality reduction techniques, and one of the most famous is Principal Component Analysis (PCA).
This algorithm projects your data into another space with lower dimensionality. Speaking in simple terms, it reduces the number of columns. The disturbing fact is that if you start with a dataset that is readable and easy to interpret, you will almost (hey, I said almost) certainly end up with a reduced number of columns that are not easy to understand at all. …
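To make the "fewer columns, harder to read" point concrete, here is a minimal sketch of PCA for the smallest possible case: two columns reduced to one. It uses the closed-form direction of maximum variance for a 2x2 covariance matrix instead of a library call. The function name `pca_2d_to_1d` and the toy data are assumptions made for illustration.

```python
import math


def pca_2d_to_1d(points):
    """Project 2D points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the direction with maximum variance (leading eigenvector).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # The single new column: each point's coordinate along that direction.
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    explained = (sum(s * s for s in scores) / (n - 1)) / (sxx + syy)
    return direction, scores, explained


# Hypothetical toy data: two strongly correlated columns.
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]
direction, scores, explained = pca_2d_to_1d(points)
print("direction:", direction)
print("share of variance kept:", round(explained, 3))
```

The new column is a weighted mix of both original columns, which is exactly why it is harder to interpret than either of them.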
Noise is so difficult to deal with; every data scientist knows that.
The fact is that, as one dear friend of mine loves to say,
“The hardest part of getting what you want is figuring out what it is”
Indeed, we can’t specify what noise really is. As a physicist, I find myself in the situation of studying a dataset and trying to understand if my data makes physical sense. When a clear pattern can’t be identified in a part of my data (or my signal), I tend to classify that part as “noise”. But this approach could be dangerous and misleading. …
One of the sentences that my professor used to say in high school was “History repeats itself”. The meaning of this sentence is obviously that we should learn from history, so that we do not make the same mistakes we made in the past.
Now let’s talk about science. If you have a time series, then you have your data for a (preferably) long time. Let’s assume for a second that history actually does repeat itself. That would mean that by simply replicating the signal you would extend your data, thus obtaining a dataset twice as long. I know what you’re thinking, and you are right, I’m fooling you. Indeed, what I’m essentially describing in these few lines is the idea behind the Fourier Transform: each periodic signal can be seen as a sum of sines and cosines. …
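The "sum of sines and cosines" idea can be checked on a tiny example. The sketch below builds a signal from one sine and one cosine and recovers their frequencies and amplitudes with a naive discrete Fourier transform. It is a slow, didactic version, not an FFT, and the signal itself is a made-up assumption.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued list."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


# Hypothetical signal: a sine with 3 cycles (amplitude 2) plus a cosine
# with 5 cycles (amplitude 1) over a window of 64 samples.
n = 64
signal = [
    2.0 * math.sin(2 * math.pi * 3 * t / n) + 1.0 * math.cos(2 * math.pi * 5 * t / n)
    for t in range(n)
]

spectrum = dft(signal)
# Amplitude of each frequency, keeping only the non-redundant first half.
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]
# The two strongest frequencies, in cycles per window.
dominant = sorted(range(n // 2), key=lambda k: amplitudes[k], reverse=True)[:2]
print("dominant frequencies:", dominant)
print("their amplitudes:", [round(amplitudes[k], 6) for k in dominant])
```

Because the transform treats the window as one period, it implicitly assumes that the signal repeats itself forever, which is the "history repeats itself" trick in disguise.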
If you are a data scientist, one of your typical tasks is to analyze a certain signal and find its peaks. This may be your goal for a ton of reasons, but the main one is that peaks somehow tend to show a certain property of your signal. It is not surprising that a library that helps you find peaks already exists in Python: it is SciPy (with its find_peaks function). …
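As a rough idea of what peak finding means, here is a naive sketch that marks every sample strictly larger than both of its neighbours, with an optional minimum height. It is not SciPy's implementation: `find_local_maxima` is a hypothetical helper, it ignores flat-topped peaks, and the signal is made up.

```python
def find_local_maxima(signal, min_height=None):
    """Return indices of samples strictly greater than both neighbours."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            # Optionally discard peaks that are too low.
            if min_height is None or signal[i] >= min_height:
                peaks.append(i)
    return peaks


signal = [0, 1, 3, 1, 0, 2, 5, 2, 1, 4, 1]
all_peaks = find_local_maxima(signal)
tall_peaks = find_local_maxima(signal, min_height=4)
print(all_peaks)   # [2, 6, 9]
print(tall_peaks)  # [6, 9]
```

In practice `scipy.signal.find_peaks` covers this and more, for example through its `height` argument.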
When you are a kid, sometimes your overprotective mother makes you feel beautiful, smart and kind. Of course, if you are one of those kids, you are confident that everyone thinks exactly the same thing that your mother thinks about you, but when you grow up and you go to school, sometimes your teacher tells you that you are acting wrong, and you are not so kind or smart or beautiful! In that moment you need to realize that maybe your mother loves you too much and gave you a false impression of yourself: your model is overfitting :)
If we think in Machine Learning terms, we find ourselves in an ‘overfitting’ scenario when the capacity of our model is higher than what our specific task requires. In other words, our algorithm performs well on our specific dataset, but it is not able to generalize to a dataset that it has never seen. …
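A small experiment can show the gap between "good on my dataset" and "good on unseen data". In this sketch the overprotective mother is a 1-nearest-neighbour model that memorizes every training label, noise included, and it is compared with a one-parameter threshold model. The data generator, the 20% label noise and all the names are assumptions made for illustration.

```python
import random


def make_data(n, noise, rng):
    """Points in [0, 1); true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


def fit_threshold(train):
    """Simple model: pick the cut on a 0.01 grid with the best training accuracy."""
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: accuracy(lambda x: x > t, train))


def memorizer(train):
    """High-capacity model: copy the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]


rng = random.Random(42)
train = make_data(200, 0.2, rng)
test = make_data(2000, 0.2, rng)

threshold = fit_threshold(train)
simple_test_acc = accuracy(lambda x: x > threshold, test)
memorizer_train_acc = accuracy(memorizer(train), train)
memorizer_test_acc = accuracy(memorizer(train), test)

print(f"memorizer: train {memorizer_train_acc:.2f}, test {memorizer_test_acc:.2f}")
print(f"threshold model: test {simple_test_acc:.2f}")
```

The memorizer looks perfect on the data it has already seen because it has learned the noise as well, and that is exactly what hurts it on new data.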
A description of implementations of Machine Learning techniques able to distinguish successful films from those not appreciated by the public.
Let’s start with a necessary premise. There are no good films or bad films. When we deal with a film, we are considering an artistic creation and, as such, it cannot be judged except in a strictly personal way. In one of her wonderful quotes, Meryl Streep even says she has a theory that films work on the same plane as our dreams. …