`.)
## Paths preparation
Before running anything, set the correct paths and parameters.
In ``paths_configuration.json``:
- Add the tile code of the location you want to add
- Create the output directory for ALCD, and set its path in the ``data_alcd`` variable
- Set the correct paths for the L1C directory and the ``DTM_input``.
In ``global_parameters.json``, if you work with both a distant and a local machine, set the ``local_paths`` variables accordingly.
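A quick check of the configured paths before launching anything can save a failed run. The sketch below is only an illustration: it assumes the JSON file is a flat mapping from variable names to path strings, which may not match the real layout of ``paths_configuration.json``, and the helper name ``find_missing_paths`` is hypothetical.

```python
import json
import os


def find_missing_paths(config_file, keys):
    """Return the subset of `keys` whose value is absent or not an existing path.

    Assumption: the JSON file is a flat {name: path} mapping.
    """
    with open(config_file) as handle:
        config = json.load(handle)
    missing = []
    for key in keys:
        value = config.get(key)
        # A key counts as missing if it is absent, not a string, or points nowhere.
        if not isinstance(value, str) or not os.path.exists(value):
            missing.append(key)
    return missing
```

For instance, ``find_missing_paths("paths_configuration.json", ["data_alcd", "DTM_input"])`` would return the names that still need fixing, under the flat-layout assumption above.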
## Step 1
It is good practice to look at the two dates you want to use beforehand. The script
*quicklook_generator.py* helps with this by generating quicklooks for a given location.
You can then make sure that the cloud-free image is indeed cloud-free, and that the
image to be classified is interesting.
Then, initialize the environment by running:
```bash
python all_run_alcd.py -f True -s 0 -l city_name_dir -d cloudy_date -c clear_date -kfold False -global_parameters path_of_global_parameters.json -paths_parameters path_of_paths_configuration.json -model_parameters path_of_model_parameters.json
```
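Because the same long command is reused in the following steps with only ``-f`` and ``-s`` changing, it can help to assemble it programmatically. The sketch below uses only the flags shown in the command above; the helper name ``build_alcd_command`` is hypothetical and not part of ALCD.

```python
import subprocess


def build_alcd_command(location, cloudy_date, clear_date, first_iteration, step,
                       global_parameters, paths_parameters, model_parameters):
    """Assemble the all_run_alcd.py call used in this tutorial as an argument list."""
    return [
        "python", "all_run_alcd.py",
        "-f", str(bool(first_iteration)),
        "-s", str(int(step)),
        "-l", location,
        "-d", cloudy_date,
        "-c", clear_date,
        "-kfold", "False",
        "-global_parameters", global_parameters,
        "-paths_parameters", paths_parameters,
        "-model_parameters", model_parameters,
    ]


# Placeholders are the same as in the command above; replace them with real values.
command = build_alcd_command(
    "city_name_dir", "cloudy_date", "clear_date", True, 0,
    "path_of_global_parameters.json",
    "path_of_paths_configuration.json",
    "path_of_model_parameters.json",
)
print(" ".join(command))
# To actually launch it: subprocess.run(command, check=True)
```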
This creates the concatenated .tif with all the bands and an empty shapefile for each class,
among other things. The script then invites you to copy the created files to your local machine, to speed up
the work in QGIS (on our processing computer, visualisation is slow, so we use QGIS on a different
computer). If you prefer to edit the files in place, skip the manual copy and go to Step 2.
Otherwise, copy the files to the machine running QGIS, then go to Step 2.
## Step 2
You can now open QGIS. Open the rasters ``In_data/Image/city_name_bands_H.tif`` (H stands for
Heavy, as it is at the full resolution of 20m per pixel) and ``In_data/Image/city_name_bands.tif``.
The bands of ``city_name_bands_H.tif`` are band 2 (blue), band 3 (green), band 4 (red), band 10 (the band at
1375nm), the NDVI and the NDWI. ``city_name_bands.tif`` contains many more bands;
the content of each one is documented in the .txt file that accompanies each .tif.
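As a reading aid, the band layout of the heavy .tif can be written down as a lookup table. The 1-based ordering below is an assumption taken from the order in which the bands are listed above (only the position of the 1375nm band as band 4 is stated explicitly later in this tutorial). The helper shows the generic normalized-difference form on which NDVI is built, (NIR - red) / (NIR + red); the exact NDWI variant used by ALCD is not stated here, so it is not reproduced.

```python
# Assumed 1-based band order of city_name_bands_H.tif, as displayed in QGIS.
HEAVY_TIF_BANDS = {
    1: "B2 (blue)",
    2: "B3 (green)",
    3: "B4 (red)",
    4: "B10 (1375 nm)",
    5: "NDVI",
    6: "NDWI",
}


def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


# NDVI for one pixel, from illustrative near-infrared and red reflectances.
ndvi = normalized_difference(0.5, 0.25)
print(HEAVY_TIF_BANDS[4], ndvi)
```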
Now, adjust the style in QGIS so that you see the image in true colors. To do so, you
can load the file ``color_tables/heavy_tif_true_colors_style.qml`` on the Heavy .tif. You should get:
Figure 2: QGIS window with the scene displayed in true colors
Now, load all the empty shapefiles from the directory ``In_data/Masks``.
If you display a band that is a time difference (for example the 20th band of ``city_name_bands.tif``),
you will see that the clear date has no data in the bottom-right corner,
and that the cloudy date has no data in the top-left corner.
The no_data file is therefore already populated, as is the case whenever one or both of the original
images have no_data pixels. As you can see on ``Figure 2``, the top-left and bottom-right corners
are covered by the no-data mask. If you are not satisfied with the mask, you can edit it manually.
You should get something along the lines of the following:
Figure 3: No-data areas are automatically computed
This no-data layer is used to discard the areas under it, both during the classification and when the
user adds samples in these areas by mistake.
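The effect of the no-data layer on misplaced samples can be pictured with a small filter. This is a simplified sketch on a toy grid, not the ALCD implementation: samples are assumed to be (row, col) pixel positions and the mask a 2D list of booleans, both of which are illustrative choices.

```python
def drop_samples_in_no_data(samples, no_data_mask):
    """Keep only the (row, col) samples that lie outside the no-data mask."""
    kept = []
    for row, col in samples:
        inside = 0 <= row < len(no_data_mask) and 0 <= col < len(no_data_mask[0])
        if inside and not no_data_mask[row][col]:
            kept.append((row, col))
    return kept


# 3x3 toy mask: True marks no-data pixels (top-left and bottom-right corners).
mask = [
    [True, False, False],
    [False, False, False],
    [False, False, True],
]
print(drop_samples_in_no_data([(0, 0), (1, 1), (2, 2), (0, 2)], mask))
```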
## Step 3
It is now time to edit the mask layers. For each class (land, low clouds, etc.), edit the corresponding
layer. Add the points that you want to use as samples by clicking on the image and
pressing Enter for each point. We have found it more efficient to use points rather than polygons.
The points are later dilated by 3 pixels, on the assumption that the neighbourhood is homogeneous in terms of
class, so avoid placing a point right at the edge of a feature (cloud, land).
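To see why points near an edge are risky, consider what a 3-pixel dilation does to a single sample. The sketch below assumes a square window (3 pixels in every direction, clipped to the image); the actual shape used by ALCD is not specified here, and the function name is hypothetical.

```python
def dilate_point(row, col, radius, n_rows, n_cols):
    """Return the pixels of the square window of half-width `radius` around a point."""
    pixels = []
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            pixels.append((r, c))
    return pixels


# A point far from the border of a 100x100 image grows to a 7x7 window.
print(len(dilate_point(50, 50, 3, 100, 100)))
# A point in the image corner is clipped to the image extent.
print(len(dilate_point(0, 0, 3, 100, 100)))
```

Under this assumption, one click labels up to 49 pixels with the same class, so a point placed within 3 pixels of a cloud boundary would label pixels on both sides of it.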
High clouds are visible in the 1375nm band (i.e. band number 4 of the heavy
.tif). You can load the style ``heavy_tif_clouds_green_style.qml`` to see them quickly.
``Figure 4`` shows the image with the high clouds highlighted, and the steps to add points.
Figure 4: Steps to add data points with QGIS
```{note}
The ``1375nm`` band is not to be trusted blindly. The principle of this band is that the
water vapour in the atmosphere usually absorbs photons at this wavelength. However, in
dry conditions, or over high-altitude terrain (such as mountains), the photons can be reflected
back. This can be misleading, so the user should take precautions. A typical way to detect such
artefacts is to check whether the shape of the potential cirrus is strongly correlated with that of the underlying
terrain.
```
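The terrain check described in the note is done by eye in this tutorial, but the idea can be expressed numerically. The sketch below computes a Pearson correlation between 1375nm reflectances and elevations sampled at the same pixels. The values are invented for illustration, and using a correlation coefficient for this purpose is an assumption, not something ALCD does.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no shape to compare
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: 1375 nm reflectance rising with elevation.
cirrus_band = [0.01, 0.02, 0.03, 0.04]
elevation_m = [500.0, 1500.0, 2500.0, 3500.0]
print(pearson_correlation(cirrus_band, elevation_m))
```

A coefficient close to 1 over a suspected cirrus would suggest that the signal follows the relief rather than a cloud.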
You can now go back to true colors and continue by editing all the classes you need. The
background class can be used if, for example, you do not want to discriminate between land and water,
but its use is not recommended.
At the end, you obtain ``Figure 5``.
Figure 5: Samples placed manually after the first iteration
## Step 4
Now, copy the edited masks back to the distant machine, or skip this if you work on a single machine.
It is time to train the model and classify the image! Do it with:
```bash
python all_run_alcd.py -f True -s 1 -l city_name_dir -d cloudy_date -c clear_date -kfold False -global_parameters path_of_global_parameters.json -paths_parameters path_of_paths_configuration.json -model_parameters path_of_model_parameters.json
```
The results can be found in the Out directory. The regularized classification map is
``labeled_img_regular.tif``. You can also see the contingency table in the Statistics directory.
As you can see on the classification map (``Figure 6``), some pixels are not well classified.
Moreover, the confidence is low in numerous places, as seen on ``Figure 7``. Therefore, we will take
advantage of the key feature of this program: active learning.
Figure 6: Result of the first classification
Figure 7: Confidence map of the first classification
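Low-confidence areas are natural candidates for new samples in the next iteration. The sketch below lists the pixels of a toy confidence map that fall under a threshold. The [0, 1] scale, the threshold value and the function name are all assumptions made for the example; the real confidence map may use another scale.

```python
def low_confidence_pixels(confidence_map, threshold):
    """List the (row, col) positions whose confidence is below `threshold`."""
    return [
        (r, c)
        for r, row in enumerate(confidence_map)
        for c, value in enumerate(row)
        if value < threshold
    ]


# Toy 2x3 confidence map with values in [0, 1].
confidence = [
    [0.95, 0.40, 0.90],
    [0.30, 0.85, 0.99],
]
print(low_confidence_pixels(confidence, 0.5))
```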
## Step 5
Run a new iteration with:
```bash
python all_run_alcd.py -f False -s 0 -l city_name_dir -d cloudy_date -c clear_date -kfold False -global_parameters path_of_global_parameters.json -paths_parameters path_of_paths_configuration.json -model_parameters path_of_model_parameters.json
```
This saves the previous iteration, and you can now edit the class layers by adding new
points (and removing some if you made an error previously). You can copy the outputs of the
previous iteration to your local machine (the bash command is printed when you run the
command above).
We suggest opening the files ``Out/contours_labels.tif`` and ``Out/labeled_img_regular.tif``,
and applying to them the ``contours_labeled_contrasted_style.qml`` and ``labeled_img_regular_style.qml``
styles respectively. This gives each class a recognisable color, as listed in ``Table 1``.
Table 1: Available classes and colors
For example, you can display the contours of the classes to see where the classifier was wrong.
Here, we obtained a false detection of *cloud shadows* on the left of the image, which can be
seen with the yellow contours:
Figure 8: Contours of the shadows, in yellow
Therefore, we will add some points of land class in this region to increase the accuracy of
our output, as shown in ``Figure 9``.
Figure 9: Some land samples are added where the wrong classification is visible
Do this for every area where a misclassification is visible.
Once the wanted points in each class have been added, you can copy the layers back to the
distant machine with the appropriate command.
Finally, run the training and the classification once again with:
```bash
python all_run_alcd.py -f False -s 1 -l city_name_dir -d cloudy_date -c clear_date -kfold False -global_parameters path_of_global_parameters.json -paths_parameters path_of_paths_configuration.json -model_parameters path_of_model_parameters.json
```
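The calls used so far follow a regular pattern: ``-f True`` for the first iteration and ``-f False`` afterwards, with ``-s 0`` to prepare the files and ``-s 1`` to train and classify. The small generator below, with a hypothetical name, simply enumerates that order as read from the commands in this tutorial.

```python
def iteration_plan(n_iterations):
    """Yield (first_iteration, step) pairs in the order the tutorial runs them."""
    for iteration in range(n_iterations):
        first = iteration == 0
        yield first, 0  # prepare the files, then edit the samples in QGIS
        yield first, 1  # train the model and classify the image


for first, step in iteration_plan(2):
    print(f"-f {first} -s {step}")
```

Between the two calls of each iteration, the samples are edited by hand in QGIS, so this order is a checklist rather than something to run unattended.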
## Step 6
Repeat Step 5 until you are satisfied with the classification that the ALCD algorithm returns.
Quick tip: part of the samples (30% by default) is used for the validation of the model, i.e. only
to compute statistics. If you want more of the samples that you add manually to be used
for training, you can increase the ``training_proportion`` in
``global_parameters.json``.
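The effect of ``training_proportion`` can be illustrated with a plain random split. This is a sketch of the general idea only: how ALCD actually draws the validation samples is not described here, and the function name and fixed seed are choices made for the example. A proportion of 0.7 corresponds to the default of 30% kept for validation.

```python
import random


def split_samples(samples, training_proportion, seed=0):
    """Shuffle the samples and split them into training and validation lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * training_proportion)
    return shuffled[:n_train], shuffled[n_train:]


train, validation = split_samples(range(100), 0.7)
print(len(train), len(validation))
```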
Here is an example of the classification that you could obtain after each iteration. The 6th
one is considered good enough by the author, so you can stop there.
Figure 10: Evolution of the classification
As a reference, the QGIS windows at the last iteration, showing all the samples, the labeled
classification and the confidence map, are given in ``Figures 11, 12 and 13``.
Figure 11: All samples present for the last iteration
Figure 12: Labeled classification as seen in QGIS window for the last iteration
Figure 13: Confidence map as seen in QGIS for the last iteration