Last week I participated in the ECML-PKDD 2020 Conference. The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases is one of the most recognized academic conferences on ML in Europe.
In the spirit of spreading the word about ML developments, I wanted to share my selection of the best “applied data science” papers from the conference. This is the second post in the series. The previous one, about top research papers, can be found here. …
Simply watching how an agent acts in the environment tells us very little about why it behaves this way or how it works internally. That’s why it is crucial to establish metrics that tell WHY the agent performs in a certain way.
This is especially challenging when the agent doesn’t behave the way we would like it to, … which is pretty much always. …
Image processing is a very useful technology and the demand from the industry seems to be growing every year. Historically, image processing that uses machine learning appeared in the 1960s as an attempt to simulate the human vision system and automate the image analysis process. As the technology developed and improved, solutions for specific tasks began to appear.
The rapid acceleration of computer vision in 2010, thanks to deep learning and the emergence of open-source projects and large image databases, only increased the need for image processing tools. …
In machine learning (ML), generalization usually refers to the ability of an algorithm to be effective across various inputs. It means that the ML model’s performance does not degrade on new inputs drawn from the same distribution as the training data.
For human beings, generalization is the most natural thing possible. We can classify on the fly. For example, we would definitely recognize a dog even if we had never seen that breed before. Nevertheless, it might be quite a challenge for an ML model. …
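One rough way to make this concrete is to compare accuracy on the training data with accuracy on held-out data drawn from the same distribution. Below is a minimal sketch of that comparison. The toy one-dimensional dataset, the threshold "model", and all function names are my own assumptions for illustration, not something taken from a specific library or paper.

```python
import random


def make_data(n, rng):
    """Draw n labelled points: class 0 centred at -1.0, class 1 at +1.0 (toy data)."""
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        center = 1.0 if label == 1 else -1.0
        data.append((rng.gauss(center, 1.0), label))
    return data


def fit_threshold(train):
    """Toy 'model': the midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0


def accuracy(threshold, data):
    """Fraction of points where 'x > threshold' agrees with the label."""
    correct = sum(1 for x, y in data if (x > threshold) == (y == 1))
    return correct / len(data)


def generalization_gap(seed=0, n_train=200, n_test=200):
    """Return (train accuracy, test accuracy, train minus test)."""
    rng = random.Random(seed)
    train = make_data(n_train, rng)
    test = make_data(n_test, rng)  # same distribution, unseen points
    threshold = fit_threshold(train)
    train_acc = accuracy(threshold, train)
    test_acc = accuracy(threshold, test)
    return train_acc, test_acc, train_acc - test_acc
```

A gap close to zero suggests the model generalizes to new inputs from the same distribution, while a large positive gap is the usual sign of overfitting.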
Last week I had the pleasure of participating in the ECML-PKDD 2020 Conference. The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases is one of the most recognized academic conferences on ML in Europe.
It was a fully online event that ran around the clock, a nice idea that made it accessible in all time zones. The conference schedule, neatly divided into many tracks, made it simple to dive into my favourite topics: reinforcement learning, adversarial learning and meta-topics.
ECML-PKDD brings a large number of new ideas and inspiring developments in the ML field, so I wanted to pick the top papers and share them here. …
There is a common business saying that you can’t improve what you don’t measure. This is true in machine learning as well. There are various tools for measuring the performance of a deep learning model: Neptune AI, MLflow, Weights and Biases, Guild AI, just to mention a few. In this piece, we’ll focus on TensorFlow’s open-source visualization toolkit TensorBoard.
The tool enables you to track various metrics, such as accuracy and log loss, on the training or validation set. As we shall see in this piece, TensorBoard provides several tools that we can use in machine learning experimentation. …
In software development, Continuous Integration (CI) is the practice of frequently merging code changes from the entire team into a shared codebase. Before any new code can be merged, it is automatically tested and checked for quality.
CI keeps the codebase up to date, clean, and tested by design, and helps find problems with it quickly.
The way I see it:
Continuous Integration in machine learning extends the concept to running model training or evaluation jobs for each trigger event (like a merge request or a commit).
This should be done in a way that is versioned and reproducible to ensure that when things are added to the shared codebase they are properly tested and available for audit when needed. …
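To illustrate the idea, here is a minimal sketch of an evaluation gate that a CI job could run on each trigger. Everything in it is hypothetical: `evaluate_model` is a stand-in for a real training or evaluation job, and the metric names and thresholds are placeholders I made up. It only shows the shape of the check: compare metrics against minimum values, and emit a record that can be stored for audit.

```python
import hashlib
import json
import sys

# Hypothetical minimum acceptable values; a real project would choose its own.
THRESHOLDS = {"accuracy": 0.90, "f1": 0.85}


def evaluate_model():
    """Stand-in for a real training/evaluation job (hypothetical numbers)."""
    return {"accuracy": 0.93, "f1": 0.88}


def check_metrics(metrics, thresholds):
    """Return a list of human-readable failures; an empty list means pass."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    return failures


def build_report(metrics, thresholds, commit):
    """Bundle the run into one record, with a content hash for later audit."""
    record = {"commit": commit, "metrics": metrics, "thresholds": thresholds}
    payload = json.dumps(record, sort_keys=True)
    record["digest"] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return record


def main(commit="unknown"):
    """Run the gate; return 0 on success and 1 if any metric is too low."""
    metrics = evaluate_model()
    failures = check_metrics(metrics, THRESHOLDS)
    print(json.dumps(build_report(metrics, THRESHOLDS, commit), indent=2))
    for failure in failures:
        print("FAILED", failure, file=sys.stderr)
    return 1 if failures else 0
```

A CI step could call `raise SystemExit(main(commit_sha))`, where `commit_sha` comes from the CI system, so that a non-zero exit code blocks the merge. Storing the printed report alongside the commit is one simple way to keep runs versioned and auditable.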
JupyterLab, a flagship project from Jupyter, is one of the most popular and impactful open-source projects in Data Science. One of the great things about the Jupyter ecosystem is that if there is something you are missing, there is either an open-source extension for that or you can create it yourself.
In this article, we’ll talk about JupyterLab extensions that can make your machine learning workflows better.
As folks from JupyterLab say:
“JupyterLab is designed as an extensible environment”.
A JupyterLab extension is simply a plug-and-play add-on that makes more of the things you need possible. …
“Artificial intelligence will reach human levels by around 2029. Follow that out further to, say, 2045, and we will have multiplied the intelligence — the human biological machine intelligence of our civilization — a billion-fold.”
– Ray Kurzweil, American inventor and futurist.
As we know, “Data” is the new power, and companies around the globe are trying to leverage this power in their businesses. Whether that business is:
Training machine learning or deep learning models can take a really long time.
If you are like me, you like to know what is happening during that time:
Neptune lets you do all that, and in this post I will show you how to make it happen. Step by step.
Check out this example run monitoring experiment to see what this can look like. …