## Tuesday, October 11, 2016

### Schlieren imaging system

Hello and welcome to another blog post. And it is not going to be neither about genetic algorithms or programming nor about space related stuff. Today's blog post is going to be about a nice physics phenomenon called schlieren effect and it's application for visualising air flows. There is going to be a lot of pictures and if you get to the end, a video is waiting for you. Pleasant reading!

# What is the schlieren effect?

The word "schlieren" comes from the German language and is, in fact, a plural of a word "schliere" which means "streak", so schlieren means streaks. And streaks are what all this is about, but let's call it schlieren. The schlieren in question are schlieren in gas. For example, when there is a heat source, the air around this source heatens up which makes it less dense which means that it also refracts light differently than the surrounding cold air. That leads to schlieren produced by light passing thourgh this non-homogeneous environment. You can see this effect every summer - look over a roof of a car that has been standing in sunlight for some time. You are going to see the image from behind the car subtly shaking as the hot air rises from the roof.

# What is the schlieren photography, how does it work?

Except for such extreme cases like hot air rising from the black car in a summer day, the schlieren are often next to invisible for naked eye. For example, when you light a candle, you can't see much more than the flame itself. Another example might be an air stream from a haidryer. However, with a proper schlieren photography setup, you can visualise these subtle differences.

There are several ways of visualising the tiny differences in air (gas) density. Here I'm going to focus on the probably simplest and cheapest one which is the single mirror setup. This setup requires three key components:

• a point light source,
• a concave spherical mirror (parabolic can also work) and
• a sharp edge.
The mirror must have a focal length (half its curvature radius in case of spherical mirror) in the order of meter(s).

Now, the best way to illustrate how does it actually work is to show it with a picture:

As you can see, the point light source is placed at double the focal length from the mirror and a little bit off-axis. Double the focal length is the radius of the mirror's curvature. If you put something in the center of a reflective sphere, its image gets reflected back to itself. In this case the point gets reflected to the opposite side of the optical axis back to a point (technically, because it is off-axis, it does not focus exactly into a point but the difference is so small that it can be ignored). The sharp edge, e.g. a razor's edge, is placed so that it goes through exactly the focused point of light. A camera is placed behind the edge so that it looks straight into the mirror through the focused point at the edge. The testing area is in front of the mirror.

Normally, when there is no disturbance in the path of the light, it is just focused back to a point and the camera captures a uniform brightness across the mirror. However, when there is even a small disturbance in the air (gas) density, it very slightly bends the light. It gets reflected from the mirror and the slight disturbance gets magnified as it travels towards the focus point. Since the light is diverted from its original path it's going to be out of focus. Some of this distubed light will get blocked by the razor, causing a dark spot in the image, and the other part will "dodge" the razor, causing a bright spot in the image.

When everything is set up and you light a candle in front of the mirror, it might look like below. This is the very first image I captured on my schlieren imaging setup:

Fotka zveřejněná uživatelem Jan Žegklitz (@gandalvv),

# My schlieren setup

My schlieren setup and the making of it is heavily inspired by this great guide, so everything that is said there more or less applies to my setup as well.

## The mirror

The most expensive and the most important component is the mirror. I have a primary telescope mirror, spherical, 160 mm in diameter and with a focal length of 1400 mm. That means that the light source and the edge is 2800 mm from the mirror. I got my mirror at eBay from optics21406 for 69.99 \$. I can say that the mirror is great, very precise.

The mirror looks like this:

Fotka zveřejněná uživatelem Jan Žegklitz (@gandalvv),

It's a nice and robust piece of glass. It weighs about a kilogram! Quite a lot for such, one would say, a small mirror :).

Regarding the mirror, if you want to build your own schlieren system, forget about using your magnifying bathroom mirror. These are VERY imprecise. You need to get a telescope grade mirror. The focal length must not be too short. Otherwise the light might not get enough space to be diverted enough to be blocked by the razor. Also, the size matters. Much. The bigger the mirror, the bigger the testing area is going to be. So forget about laser optics - these are very small (since laser beams tend to be very narrow).

The mirror mount is very homemade :). Sticking to the guide I mentioned above, it's made of cardboard, specifically from the box the mirror arrived in:

Fotka zveřejněná uživatelem Jan Žegklitz (@gandalvv),

## The light source and the edge

I use a super-bright white LED (3.4 V, 20 mA, I forgot the luminosity), shining though a piece of aluminium foil with a small hole in it. I power it by a 9V battery through a 270Ω resistor. The edge is just a simple razor edge. I have it all "mounted" and taped on a piece of polystyrene which itself is mounted (taped) on a tripod. It looks like this:

Fotka zveřejněná uživatelem Jan Žegklitz (@gandalvv),

To be honest, this image was taken before I set the camera up. After I had set it up, I had to cover the LED from behind so that it hasn't produced parasite light.

## The result

Finally, I recorded a video of some stuff in front of the mirror. To be specific, you will see a candle being lit up and moved around, a hairdryer blowing on its own and also on an obstacle (my hand), and also the famous chemical reaction of baking soda and vinegar, producing some carbon dioxide (which is heavier than air). Enjoy!

## What can be improved

My current setup is far from perfect and there are some things that can certainly be improved.

### The light source

The super-bright LED is very fine, but the other stuff is not. I think I could make the hole in the aluminium foil even smaller so I would get better resolution of the disturbances. Also, as you saw it in the photo above, the backwards reflection from the aluminium foil shined a lot into my camera's lens and I had to additionally cover it with a spare lens cover. The best would be to just make some nice box with a switch, proper place for battery, a tripod thread for secure mounting and maybe some additional screws for precise alignment.

### The mirror

The mirror is amazing! I could, of course, use bigger but that would be at a completely different budget level... But back to the troubles. It is terribly easy to get it dirty and it was. For telescope usage it is not a big deal as there the whole mirror makes up the whole image "collectively", so even with quite a significant amount of dust (judging by eye) it can still produce crystal clear images. Here, however, every spot on the mirror makes its own part of the image. And cleaning a telescope mirror is not an easy task if you really don't want to damage the protective and reflective coating. I tried somewhat and kind of failed. Luckily, it was the good type of failure - I failed at removing the dirt, not scratching the mirror.

### The razor

In the video you could have seen two phenomena. The first one is that when the hairdryer was blowing horizontally it was much less visible than when it was blowing vertically. This has to do with the orientation of the razor. If the razor was horizontally aligned it would have been vice versa. The second one is that sometimes, e.g. when I put something in front of the mirror and it shaked, some of the stuff in the mirror that seemed like dirt shaked too. That means that it was not a dirt on the mirror but some other abberation somewhere else. I suspect that it could be the razor.

Another thing I would like to try is to replace the opaque razor with some color filter. The effect would then be colored instead of black and white. It could be quite nice :). Another idea that occurred to me was to replace the razor with yet another pinhole (about the same size as the one in front of the LED) or with two razors perpendicular so that any disturbancea in all/both directions would be pronounced. But I'm not very sure about this.

### Other stuff

As you could have seen, the colors in the video are wildly red. That's because of me screwing up the white balance. I also had a lot of troubles with stability of the setup as I have set it up on carpets so the razor was sliding to one side all the time. Also, my tripods are far from ideal. The one with the light source is no big deal but my camera tripod is a mess. It's very old, some pieces are broken and it's not very stiff, which is kind of a problem when you need to align the camera precisely :). Some intermediate piece between the tripod and the camera, with extra collimation screws for precise alignment would be awesome but that is a pie in the sky for now.

I hope you enjoyed this first experiment. I will do more of them and better and I will report about it here. If you have any comments or questions, feel free to leave them below.

## Sunday, September 11, 2016

### ESA Citizen's Debate 2016

Yesterday, on Saturday, 10th September 2016, I took part in the first ever Citizen's Debate about space related stuff organized by ESA, the European Space Agency. This debate took place in all of the 22 member countries of ESA.

Each country had up to 100 people participating in the debate. Since I'm Czech and live in the Czech Republic, I was at Prague, the capital of Czech Republic. The Prague debate was organized by the Technology Centre of the Czech Academy of Sciences and was held at the Masaryk dormitory. In order to be able to participate in the debate, one had to apply for it through a web form where he or she stated, among other information, some demographic data - age, job, region of residence, etc. The debate organizers then selected from the pople who applied to cover the broadest part of the demographic spectrum as possible. I think this was successful as I have seen people of all genders, ages, shapes and sizes :).

The goal of the debate was for ESA to get a glimpse at what ordinary people think and feel about space in general and about ESA and what it is doing. The organisation of the debate was following. The people were divided into small groups of about 8 people (at least in Prague) who were sitting around a table and debating the topics. The debate was divided into 5 distinct parts, called sequences. Each sequence was dedicated to some topic. For each sequence, each participant received a questionaire with a bunch of questions. The people around the table then debated over the questions in the questionaires. More, there were facilitators, people who directed and constrained the small debates to keep on the topic and who also extracted and written down some interesting ideas and opinions that came up in the debates. However, the questionaires were filled individually. In addition to the 5 sequences, there were two more questionaires (at the beginning and at the very end) which were to be filled completely individually, without any discussion.

There were three big topics which were closely watched and written down (the essence of the discussion) by the facilitators. The first one was the issue of space debris and how to handle it. The second big topic was ESA financing. The last one was the future of space exploration in the form of what would we like, if we had the power.

Overall, the debate was interesting and, as Johann-Dietrich Wörner, the Director General of ESA, said, the first ever such debate in the history. ESA said that it would listen to the debate results very closely. The results of the debate (the first data are already coming int) and more info about the whole thing can be found at www.citizensdebate.space. The materials used in the debate should be available soon for everyone in the "Replica debate" section.

## Tuesday, August 02, 2016

### Symbolic Regression and Multi-Gene Genetic Programming

I'm back with my blog being about something useful. Sorry for the long waiting, I was really really busy. Today's topic will be about a class of techniques for speeding up Genetic Programming algorithms in Symbolic Regression scenarios.

# Regression vs. Classification

First of all, let me do a quick introduction/clarification regarding regression and classification. These two are machine learning tasks, similar in part, but different in the end. In both tasks we have a bunch of objects each described by a set of numbers (not necessarily but for the sake of simplicity numbers will do great).

In classification the goal is to assign a class to an object based on the numbers it is described by. The class can be any discrete value. Examples are: true/false, red/green/blue, high/middle/low, 1/2/3/4/5/6/7/8/9/10 rating, or any label. The special case with just two classes also has a special name binary classification. Typical classification problem is spam detection - is an e-mail spam or ham?

In regression the goal is to assign a numerical value to each object, again based on the numbers describing that object. Generally, these assigned numbers can have any value. Typical regression problems are various prediction and estimation problems. For example, suppose there is some complicated machine and you want to predict fuel consumption based on some operational data.

## Symbolic Regression

Symbolic Regression is a class of tasks for GP systems. Actually, it is the very first example I mentioned in Evolution in a Computer, pt. 3 - we have a bunch of data points and we want to find a function that fits them accurately (enough). It doesn't matter where the data points come from, be it anything from a demonstrative problem as in that blog post to data from medical scans where we need to asses a patient's condition. Whenever we want to find a function fitting some data, it is symbolic regression.

The word symbolic is quite important. It means that the output model is a mathematical formula, that could be, in principle, read and interpreted by a human, or, for example, be given to a computer algebra system (CAS) to compute derivatives, find roots, etc. This is in contrast with, e.g., neural networks which is characterized by its topology (how many neurons are there and how are they connected) and the values of the weights (the degrees of how pairs of neurons are connected). That is not something one could look at and see a pattern nor could be given to some CAS even though it still does regression.

So, from now on, let's talk about Symbolic Regression, or SR in short.

# Problem of SR

If you played around with the SR example from the already mentioned previous blog post, you might have noticed that it does not work very well. In the better case it just took quite a long time to get to some sensible function. In the worse case it failed completely and got stuck with some obscure function.

There are several reasons why this does not work that well. I would like to write about one of them and that is fine-tuning the parameters. Vanilla GP works with a set of terminals and non-temrinals and evolves functions by juggling around with pieces of mathematical expressions to hopefully get meaningful results. However, this juggling is often also the only mechnism of how to get some specific numbers in the final function. Imagine, that the true function (i.e. what we would like to find by evolution) is $$f(x) = \sin(3.18x - 6.57)$$. However, the GP algorithm is very unlikely to have the 3.18 and 6.57 numbers in its terminal set. It must mainpulate what it has to get (close to) these numbers.

# Multi-Gene Gentic Programming

The idea of Multi-Gene Genetic Programming, or MGGP for short, is to decompose the problem into more, presumably simpler, subproblems. However, the decomposition is not exact, as (almost) nothing around GP is. MGGP addreses part of the problem with finte-tuning parameters.

## How MGGP works

In classical GP, each individual is one tree which is also the solution to the problem, in case of SR it is the expression which is supposed to describe the data and its output is the regression estimate. In MGGP, in contrast to GP, each individual is made of multiple trees which together make the solution. These trees are called genes.

But how do we get a single number out of multiple expressions? The answer is linear combiantion. That means that the output of every gene is multiplied by some constant number and then they are all summed up together. It is basically a weighted sum of the genes. More, in MGGP, there is always an implicit gene which is always equal to the number 1. The reason for such gene is that we get an additive factor for free.

Example: let's have these three genes and (for simplicity) there is only one variable to work with, $$x$$: $$g_1(x) = x$$ $$g_2(x) = \sin(x)$$ $$g_3(x) = \cos(x)$$ and of course there is always the implicit gene of 1, let's give it the index of 0: $$g_0(x) = 1$$. Now, to get the final expression, there are some constants, let's call them $$w_i$$ which multiply the individual genes. The final expression would be, in this case: $$f(x) = w_0 g_0(x) + w_1 g_1(x) + w_2 g_2(x) + w_3 g_3(x)$$ or to make it simpler (because $$g_0(x)$$ is always going to be 1): $$f(x) = w_0 + w_1 g_1(x) + w_2 g_2(x) + w_3 g_3(x)$$

By now, you should have an idea but you should be wandering: Where do we get the constants ($$w_i$$) from? Yes, that is the key for MGGP. And the trick is really simple. We use the power of linear regression. Linear regression is a kind of regression analysis (see, SR is another kind of regression analysis) which produces models in the form of linear combination of the input features. So, for example, if our problem has two input features, the linear regression produces models like $$f(x_1, x_2) = w_0 + w_1 x_1 + w_2 x_2$$.

There are several ways of how to get the constants $$w_i$$ but probably the most widely used is the ordinary least-squares solution. Let the input features be arranged in a matrix $$\mathbf{X}$$ so that each datapoint (its input features) is one row of the matrix and let the target values be arranged in a column vectory $$\mathbf{y}$$ so that $$i$$-th element corresponds to the $$i$$-th row of the matrix $$\mathbf{X}$$. If we have these, the vector of the constants, also called weights, can be computed as:

$$\mathbf{w} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$
This way we get the weights that multiply the input features but we don't get the additive factor, $$w_0$$. This is fixed easilly by prepending a column of ones to the matrix $$\mathbf{X}$$.

Identical procedure is applied in MGGP. Instead of using the vanilla input variables, we first compute “new” features by computing the outputs of the genes - each gene produces one new feature - and of course create the implicit feature of 1. Important fact is that the computation of the weights is done at evaluation, i.e. they are always computed optimally with respect to the genes that are available in the individual.

Since there are multiple genes, the crossover operator is split into two types:

• Low-level crossover - this is a classical crossover known from GP: a random node is chosen in each individual and the subtrees rooted in these nodes are swapped.
• High-level crossover - a new type of crossover arising from the fact that there are multiple genes: whole genes (one or more) are swapped between the individuals.

## What is it good for

The major benefit of this approach is that it lifts the duty of finding the linear parts from the shoulders of the underlying GP which needs to focus on the non-linear parts only. Also, it makes a kind of decomposition of the problem - multiple, maybe smaller, genes work together to create a reasonably good solution. It addresses the problem of fine-tuning the parameters to a ceratin degree - the top-level parameters, i.e. the linear coefficients, are not needed to be found by evolution. However, the constants hidden inside the non-linear parts are not adressed by this mechanism and are sought traditionally (in GP terms).

# References

• GPTIPS - the leading implementation (and maybe the only one really used) of (not just) MGGP by Dominic Searson.
• Hinchliffe MP, Willis MJ, Hiden H, Tham MT, McKay B & Barton, GW. Modelling chemical process systems using a multi-gene genetic programming algorithm. In Genetic Programming: Proceedings of the First Annual Conference (late breaking papers), 56-65. The MIT Press, USA, 1996.
• Searson DP, Willis MJ & Montague GA. Co-evolution of non-linear PLS model components, Journal of Chemometrics, 2, 592-603, 2007.

## Monday, January 18, 2016

### Quickie: zegkljan.net

Hi folks, this is just a quick one. I have finally set up my own domain zegkljan.net. So, from now on, you can find my blog at blog.zegkljan.net. The domain itself currently redirects to the blog but in the future I want to have some other stuff there as well. Of course, the original address zegkljan.blogspot.com still works and redirects to the blog.