Posts

Showing posts from 2021

RuntimeError: Building llvmlite requires LLVM 11.x.x, got '7.0.1'.

I recently researched the transformers library, in particular the Wav2Vec2 model. The main goal was to build a prototype of automatic speech recognition (ASR) on a RaspberryPI 4 (RSP). There were a few caveats in making the whole ecosystem operational, but here I would like to describe how to deal with a quite common error related to Numba installation. Numba and llvmlite are dependencies of several packages that are widely used in the machine learning world. In this post I will describe how to install Numba and llvmlite, as well as how to deal with the following error: RuntimeError: Building llvmlite requires LLVM 11.x.x, got '7.0.1'. Be sure to set LLVM_CONFIG to the right executable path. First of all, let me explain what the problem is. While Numba is being installed, its native code is compiled, and for this purpose the build needs the LLVM compiler. In the error message the build process tells us that it found an unsuitable version of the LLVM compiler. To check your version you can...
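As a rough illustration of the version mismatch in the error message, here is a minimal Python sketch. It assumes an `llvm-config` executable is reachable either on PATH or through the LLVM_CONFIG environment variable named in the error. The helper names (`parse_version`, `find_llvm_version`, `is_supported`) are hypothetical, and the required major version 11 is taken directly from the error text.

```python
import os
import re
import shutil
import subprocess
from typing import Optional, Tuple

REQUIRED_MAJOR = 11  # from the error message: "requires LLVM 11.x.x"


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first 'major.minor.patch' triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def find_llvm_version() -> Optional[Tuple[int, ...]]:
    """Ask llvm-config for its version; return None if it cannot be found."""
    # LLVM_CONFIG is the variable mentioned in the error; fall back to PATH lookup.
    executable = os.environ.get("LLVM_CONFIG", "llvm-config")
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)


def is_supported(version: Tuple[int, ...], required_major: int = REQUIRED_MAJOR) -> bool:
    """True when the major version matches the one the build asks for."""
    return version[0] == required_major


if __name__ == "__main__":
    found = find_llvm_version()
    if found is None:
        print("llvm-config not found; set LLVM_CONFIG to its full path")
    else:
        print(found, "supported" if is_supported(found) else "not supported")
```

With the version from the error message, `is_supported(parse_version("7.0.1"))` is false, which is the situation the build complains about.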

Optax - optimization library for JAX

JAX-based libraries are becoming more and more popular. I think the guys from Google (DeepMind) picked the right direction: optimize numpy functionality and build a whole ML infrastructure on top of it. A good example of this direction is Optax, a Python library which provides different optimizers for JAX-based solutions. The repository has good introductory examples (e.g. this one ), however good knowledge of JAX is required.  More on JAX

pcm2float

I would like to share different implementations of pulse-code modulation (PCM) to float conversion. Usually we receive PCM as int16, but some models expect normalized float64. The conversion logic is not complex, and even so I have managed to find some interesting options. The basic logic looks like this:

    import numpy as np

    def pcm2float(sig, dtype=np.float64):
        sig = np.asarray(sig)  # make sure it's a NumPy array
        assert sig.dtype.kind == 'i', "'sig' must be an array of signed integers!"
        dtype = np.dtype(dtype)  # allow string input (e.g. 'f')
        # Note that 'min' has a greater (by 1) absolute value than 'max'!
        # Therefore, we use 'min' here to avoid clipping.
        return sig.astype(dtype) / dtype.type(-np.iinfo(sig.dtype).min)

The code is taken from a StackOverflow answer to the question SciPy wavfile: music in, garbage out ? What does it do? I think the most important part is in the last line. Let's split it on the log...
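To make the scaling step concrete, here is a small sketch of my own (assuming int16 input and float64 output; the sample values are made up for illustration). It spells out the pieces of the division and contrasts dividing by -min with dividing by max.

```python
import numpy as np

# Hypothetical int16 PCM samples covering the full range.
samples = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)

info = np.iinfo(samples.dtype)        # integer limits of int16
scale = np.float64(-info.min)         # negated minimum, used as the divisor
normalized = samples.astype(np.float64) / scale

# Dividing by 'max' instead pushes the most negative sample below -1.0.
by_max = samples.astype(np.float64) / np.float64(info.max)

print(normalized)
print(by_max)
```

Dividing by the negated minimum keeps every value inside [-1.0, 1.0), which is why the snippet above prefers 'min' over 'max'.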

Wavelets

Some time ago I participated in a BI data science competition. The main task was to find the best algorithm for classifying audio recordings with velcro crackle. The winner was an algorithm which used the wavelet transform. This link leads to a good article about wavelets using Python ( GitHub repo ).

Just a picture

Image
Just a picture, drawn by me, illustrating the complex process of deploying ML models for the company's biochemists.

Install Onnxruntime on RaspberryPI 4

So far I have found two ways to get ONNX Runtime on a RaspberryPI . The first is a Docker image, and the second is compiling it on your own RaspberryPI. I prefer the second one ( link ). Everything is nicely described there, but I think it is important to know a few more tools.
1. screen - Some of the compilation commands can run for multiple hours, so your ssh connection may be interrupted and you will lose the process. To avoid this, you can use screen ( good tutorial link ).
2. I use Jupyter to test my solutions, as well as virtual environments. To make a virtual environment accessible as a Jupyter kernel, please check this article .
3. You may need system packages in your virtual environment. In this case the following advice from StackOverflow helps: Create the environment with  virtualenv --system-site-packages  . Then, activate the virtualenv and when you want things installed in the virtualenv rather than the system python, use  pip install --ignore-installed  or...
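The idea behind the third point can also be sketched from Python. Note that this swaps in the standard library `venv` module instead of the `virtualenv` tool quoted above. Its `system_site_packages=True` option plays the same role as `--system-site-packages`. The function names and the environment path are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_env_with_system_packages(env_dir: Path) -> Path:
    """Create a virtual environment that can also import system-wide packages."""
    builder = venv.EnvBuilder(
        system_site_packages=True,  # same idea as `virtualenv --system-site-packages`
        with_pip=False,             # keeps the sketch fast; set True to bootstrap pip
    )
    builder.create(env_dir)
    return env_dir / "pyvenv.cfg"


def system_packages_enabled(cfg_path: Path) -> bool:
    """Read pyvenv.cfg and report whether system site-packages are visible."""
    for line in cfg_path.read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env_with_system_packages(Path(tmp) / "demo_env")
        print(system_packages_enabled(cfg))
```

Packages installed later inside such an environment still take precedence over the system ones, which is what the quoted advice relies on.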

UX maturity and values

An interesting topic about UX maturity was raised by Natalie Hanson in her blog post  UX Maturity Models – A Collection . She listed all known maturity schemas and characterized them from a company management point of view. The whole topic, in my view, is slightly political, and it usually takes a lot of time to pick the right model for a company. However, from an educational point of view I would recommend it. Second, the TED presentation on UX values at scale, made by Margaret Gould, is super motivating. I would suggest it not only to UX designers but to all specializations working on a product.

D3.js alternatives

Recently I looked for good alternatives to D3.js. I had some experience with it and find the library awesome. D3.js gives a lot of power, but as usual with power comes responsibility: charts have to be developed from very basic components. It gives a lot of flexibility, however I mostly use general-purpose charts. So I decided to create a priority list of JavaScript chart libraries.
1. plotly.js - just love it. I use the Plotly Python package as well. The interface is understandable.
2. ApexCharts.js - this library I would select second. It is simple to use and has a nice color scheme by default.
3. Nivo.js - super cool library with a lot of components. Supports SVG and Canvas.
4. Chart.js - a simple library that is easy to use.
5. Google Charts - charts library from Google. Plenty of options. Google's style :)

XCode on Linux

Just navigating the Internet, I found two interesting solutions. The first one is Sosumi . It comes as a Snap package.  Project on GitHub. The second project is Docker-OSX . Project on GitHub

RaspberryPi guys

A good source of useful information in the RaspberryPi kingdom.

Optuna: a good tool for hyperparameter optimization

A few weeks ago I found on the Internet a presentation of the Optuna hyperparameter optimization framework. Optuna offers a range of cool features and I think it could be in the toolkit of every data practitioner. The framework offers eager search spaces, well-crafted search algorithms and, of course, parallelization.  Good article . Optuna's competitor is the Orion project, created for black-box optimization of models.

FHIR

I've been learning the FHIR standard, because we are going to build a patient-facing application. For a person not from the US it looks like quite a hard task from the very beginning. Fortunately I have managed to find good resources. One of them is the site patient.dev . This site was created by a medical doctor who also has good programming knowledge. He has illustrated in practice the process of building small front-end applications which communicate with different FHIR vendors (Epic, Cerner,...).

Kafka cluster locally - no pain

Recently I developed a distributed data science platform using the service choreography architectural pattern. The solution assumed a Kafka cluster as the message (event) stream provider. I tried different docker-compose setups for building a local Kafka cluster, and the best one is from Vinsguru.

In search of a good ASR model

Recently I was researching models for automatic speech recognition (ASR). It seems that this area is thriving and already provides us with very interesting results. At the moment the most popular trends are:
1. Acoustic model + language model
2. Transformers
There is also a trend to use CNNs as more precise models, but I won't touch on it here. I think one of the important parts which is often missed is the speech cancellation model; I won't concentrate on it either.
Metrics
The main metric used to measure ASR model efficiency is WER (word error rate). It is the sum of word insertions (I), deletions (D) and substitutions (S) divided by the number of words in the ground-truth text (N).
WER = (I + D + S) / N
Basically all model creators declare the WERs of their models.
Acoustic models
I think the most common example of such models is DeepSpeech . This framework is maintained under the Mozilla foundation "wing". It provides one acoustic model and one language model...
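The WER formula above can be sketched in a few lines of Python. This is a minimal illustration, not any particular toolkit's implementation. It assumes words are separated by whitespace and finds the smallest total of insertions, deletions and substitutions with a standard word-level edit distance. The function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Compute WER = (I + D + S) / N using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # (initially empty) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing in hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors too, WER can exceed 1.0 when the hypothesis is much longer than the reference.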

How to pick a good dentist in Germany

I have been living in Germany since 2017. As it happens, I cannot boast about my teeth, so periodically I have to visit a dentist, and not only for check-ups. I found out that the following points should be considered when selecting a dentist.
1. The problem you are trying to solve. If you need a filling, I would not suggest dentists who specialize in dental implants. It is not always obvious, because nowadays almost every dentist does implants. You have to gather some information about a dentist from your colleagues and friends. Usually German people do not share their judgements about dentists, so you need to ask specifically whether their doctor makes durable fillings or does implants.
2. Dental practice infrastructure. Basically you need to find out whether a practice has its own dental laboratory. An in-house laboratory is a big plus for those who need new crowns (implants and crown replacement). I wouldn't suggest having crowns replaced at a practice without a laboratory. Usually they do n...

CsvExampleGen and UnicodeDecodeError

A common error received when running some TFX components is UnicodeDecodeError. Why does it appear? Components like CsvExampleGen receive base_location as the path to a folder with csv datasets. By design, the folder should not contain any files other than csv. If the folder contains other files, a unicode error will appear with very high probability. More details .
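One way to catch this before running the pipeline is to scan the data folder for anything that is not a csv file. The sketch below uses only the standard library and does not call TFX itself. The helper name `find_non_csv_files` is hypothetical, and the recursive scan is a conservative assumption on my side rather than a statement about how the component reads the folder.

```python
import tempfile
from pathlib import Path
from typing import List


def find_non_csv_files(folder: Path) -> List[Path]:
    """Return every file under 'folder' whose extension is not .csv."""
    return sorted(
        path
        for path in folder.rglob("*")
        if path.is_file() and path.suffix.lower() != ".csv"
    )


if __name__ == "__main__":
    # Build a throwaway folder with one dataset and two stray files.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        (data_dir / "train.csv").write_text("a,b\n1,2\n")
        (data_dir / "notes.txt").write_text("not a dataset")
        (data_dir / ".DS_Store").write_bytes(b"\x00\x01")
        for stray in find_non_csv_files(data_dir):
            print("remove or move:", stray.name)
```

Hidden files created by the operating system or by notebook tools are typical candidates for such stray files.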

Noise regulation in Germany

Below I have gathered important links for people who are interested in noise regulation:
Privater Baulärm während der Ruhezeiten, Rheinland Pfalz
Verordnung zur Durchführung des Bundes-Immissionsschutzgesetzes (Geräte- und Maschinenlärmschutzverordnung - 32. BImSchV)
Landes-Immissionsschutzgesetz (LImSchG) Vom 20. Dezember 2000

My acquaintance with Flutter.

Preface. As a short disclaimer I would like to tell a story from my life. Back when I worked for Ericsson, I had a conversation with a colleague. He was so excited about the progress of JavaScript and Node.js at that time that he was literally convinced JavaScript would replace all programming languages in the near future. In his defense, I have to say that he was the kind of developer who had been developing or testing telecommunication systems for his whole life after university. This of course influenced him as a developer, and when he decided to discover IT outside of the telecommunication area, his conclusions at the beginning of that journey were slightly off. I would ask you not to take this presentation as a marketing action or as one more statement of the only true way. I would like to ask you to take a look at the Flutter framework in an impartial way. I hope my introduction / preface illustration, and rather skeptical position regarding dominating perspectives of...

Install RabbitMQ in OpenShift

I created this post just to save and share a good article on the installation topic. By following this link you can find a good how-to on installing RabbitMQ in OpenShift. Here is a template yaml. It is worth mentioning that the template above has inbound network traffic restrictions: only inbound calls from the same application are allowed. If you face the issue of an unreachable Rabbit cluster, you could try removing the network rules.

How to add tags to audio files on Mac with Python

I am an iPhone user, and I like to listen to audiobooks. Sometimes audiobooks have no album tag, so the Apple Music application does not group them. I decided to share how I solve this problem. For now it is a video in Ukrainian. I use Mutagen to do the hard work. The script is pretty simple, so beginners can use it.

    from pathlib import Path

    from mutagen.mp3 import EasyMP3 as MP3

    # Folder with the audiobook tracks
    folder = Path.home() / 'Documents' / 'mp3_Jeeves_and_Wooster'
    assert folder.exists()

    for track in folder.glob('*.mp3'):
        tag = MP3(track)
        tag['album'] = 'Jeeves & Wooster'  # same album for every track, so they get grouped
        tag['title'] = track.name.split('.')[0]  # file name up to the first dot
        print(tag.tags)
        tag.save()

Conda config

The best conda config I have found:

    channels:
      - defaults
      - conda-forge
    create_default_packages:
      - pip
      - ipykernel
      - nb_conda_kernels
    ssl_verify: true

You should save it in your home folder in the .condarc file (create it if you do not have one). To work with different conda kernels in Jupyter, use nb_conda_kernels and ipykernel. Use:

    source activate myenv
    python -m ipykernel install --user --name myenv --display-name "Python (myenv)"

Flutter presentation

I gave a presentation on Flutter for the BIX front-end group. I am happy that the topic got interest in our developer community. From my experience with iOS and Android development, I can conclude that Flutter can facilitate and speed up the development process. The presentation can be found on SlideShare: here I think people are interested.

Accelerometer

A good post on accelerometers.