Quick start
To run this Python package, we recommend first creating a virtual environment using virtualenv or conda. The package can then be installed with:
>>> pip install trap
Once installed, you should have access to the command-line utilities trap-run and trap-view.
The former processes the images and creates a database of sources. The latter is a debug
utility for visualizing the data in the exported database.
>>> trap-run --help
>>> trap-view --help
The parameters used by TraP are stored in a .toml configuration file. The default file can be found here: https://git.astron.nl/RD/trap/-/blob/main/trap_config.toml?ref_type=heads
These parameters can be modified either in that file or via the command line, e.g. trap-run --detection-threshold=5.
Command-line parameters take precedence over the settings in the configuration file, so you can use them to override individual values.
If TraP cannot find the configuration file, you can point to it with:
>>> trap-run --config_file /path/to/config/file
By default, TraP looks for the configuration file in the directory from which you run it. Specifying the location of the config file is also handy if you keep several versions of the configuration in that directory and want to use one of your presets.
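The precedence rule can be illustrated with a minimal sketch. This is not TraP's own code: the helper `merge_settings`, the stand-in configuration values and the use of argparse are assumptions made for illustration only. It shows one common way to let options given on the command line override values read from a configuration file.

```python
import argparse


def merge_settings(config_values, cli_args):
    """Return config_values updated with every CLI option that was actually given."""
    merged = dict(config_values)
    for key, value in vars(cli_args).items():
        if value is not None:  # None means the option was not passed
            merged[key] = value
    return merged


# Stand-in for values read from a .toml file (hypothetical values).
config_values = {"detection_threshold": 8.0, "nr_threads": 4}

parser = argparse.ArgumentParser()
parser.add_argument("--detection-threshold", type=float, default=None)
parser.add_argument("--nr_threads", type=int, default=None)

# Equivalent to passing --detection-threshold=5 on the command line.
cli_args = parser.parse_args(["--detection-threshold=5"])
settings = merge_settings(config_values, cli_args)
print(settings)
```

Only the option that was passed explicitly replaces the configuration value; everything else keeps the value from the file.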
The input files can be supplied using --input_images or -i.
This can refer to a file, a directory or a glob pattern (e.g. 'images/my_image_*.fits').
When using a glob pattern, remember to wrap the pattern in quotes; otherwise the shell may expand it before TraP sees it.
If a directory or glob pattern is used, all FITS images found there will be used.
If a nested directory is supplied, its subdirectories will also be searched for FITS files.
This argument can be supplied multiple times to refer to multiple files or locations.
An example is:
>>> trap-run -i images/day1/ -i images/day3/specific_file.fits -i 'images/day2/some_other_*_files.fits'
The result will be a combination of all images of day1, the subset of the images in day2 that match the * pattern, and the specific file from day3. The FITS files are sorted by observation start time, so the order in which they are supplied does not matter.
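As a rough illustration of how a mix of files, directories and glob patterns can be expanded into one list of images, here is a minimal sketch. The function `resolve_inputs` is hypothetical and is not TraP's implementation. It also sorts alphabetically rather than by observation start time, because reading the time from the FITS headers is left out here.

```python
import glob
from pathlib import Path


def resolve_inputs(inputs, suffix=".fits"):
    """Expand files, directories and glob patterns into a list of unique paths."""
    found = []
    for item in inputs:
        path = Path(item)
        if path.is_dir():
            # Search the directory and all of its subdirectories.
            matches = [p for p in path.rglob("*" + suffix) if p.is_file()]
        elif path.is_file():
            matches = [path]
        else:
            # Treat anything else as a glob pattern.
            matches = [Path(p) for p in glob.glob(item)]
        found.extend(matches)
    # Drop duplicates and return in a stable (alphabetical) order.
    return sorted(set(found))


# Example call, mirroring the trap-run command above:
# resolve_inputs(["images/day1/", "images/day3/specific_file.fits",
#                 "images/day2/some_other_*_files.fits"])
```

Because the pattern is passed to Python as a plain string, it is expanded by `glob.glob` and not by the shell, which is the same reason the pattern is quoted on the command line.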
Using multiple CPU cores
TraP uses Dask to process multiple images at once.
This significantly speeds up the code. For more information on this, see the input parameters
--nr_threads and --scheduler described in Input arguments.
By default, Dask’s scheduling is very eager, by which I mean that Dask will try to put any idle thread
to work, even if the work it does is not immediately required. Often this works out well, but in the
case of TraP this means that Dask will read ever more images, overflowing RAM. To prevent this,
we limit the number of images that can be processed at a time through the --max_concurrent_images
argument. Even if there are 42 threads available, if --max_concurrent_images is set to 12,
no extra images are read in until these 12 are finished. There is of course more work to do than
reading images, like force fitting, associating, etc., so we will likely make use of more than the
12 threads used for reading the images. You can specify --max_concurrent_images yourself if
you so desire, but it is recommended to keep it on the default setting. By default the maximum
number of concurrent images is determined automatically by the size of the largest images
specified for this run in relation to the amount of available RAM.
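The idea of capping the number of images in flight, independently of the number of threads, can be sketched with the standard library. This is not how TraP or Dask implement it: the semaphore approach, the function names and the `overhead_factor` heuristic are all assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def estimate_max_concurrent_images(available_ram_bytes, largest_image_bytes,
                                   overhead_factor=4):
    """Guess how many images fit in RAM at once (hypothetical heuristic).

    overhead_factor is an assumed multiplier for working copies of each image.
    """
    per_image = largest_image_bytes * overhead_factor
    return max(1, available_ram_bytes // per_image)


def process_all(image_names, nr_threads, max_concurrent_images):
    """Process every image while keeping at most max_concurrent_images loaded."""
    slots = threading.Semaphore(max_concurrent_images)
    lock = threading.Lock()
    state = {"loaded": 0, "peak": 0}

    def process(name):
        with slots:  # blocks while max_concurrent_images are already in flight
            with lock:
                state["loaded"] += 1
                state["peak"] = max(state["peak"], state["loaded"])
            result = name.upper()  # placeholder for reading and processing
            with lock:
                state["loaded"] -= 1
        return result

    with ThreadPoolExecutor(max_workers=nr_threads) as pool:
        results = list(pool.map(process, image_names))
    return results, state["peak"]


results, peak = process_all([f"img{i}" for i in range(50)],
                            nr_threads=42, max_concurrent_images=12)
print(len(results), "images processed; peak images in flight:", peak)
```

In this simplified sketch, threads that cannot get a slot simply wait. In TraP those threads can pick up other work, such as fitting or association, which is why more than 12 threads may be busy at once.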
Viewing a progress dashboard
When the --scheduler=distributed setting is used, a dashboard is created.
This dashboard shows information on the progress and resource usage.
The location of the dashboard is printed to the terminal at the start of the run,
but the default location is http://127.0.0.1:8787/status.
If you run TraP on a different machine, you either have to connect to that machine
via VNC and open the URL in a browser there, or create an SSH tunnel
so that you can view the dashboard in the browser on your own machine.
Creating a tunnel on Ubuntu usually looks something like: ssh -L 8787:localhost:8787 -N user@machine.server.com
Here I mapped port 8787 on my machine to the same port on the other machine. If this does not work for you,
check the TraP stdout logs (terminal output) for the port used on the machine it was running on.
Note
The dashboard is only live while the program is running.
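If the dashboard page does not load, it can help to check whether anything is listening on the port at all. The following sketch uses only the Python standard library. The helper `dashboard_reachable` is hypothetical and not part of TraP, and it assumes the default address and port mentioned above.

```python
import socket


def dashboard_reachable(host="127.0.0.1", port=8787, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if dashboard_reachable():
    print("Dashboard port is open: http://127.0.0.1:8787/status")
else:
    print("Nothing is listening on port 8787; is TraP running and the tunnel up?")
```

The check only tells you that the port accepts connections. It does not confirm that the service behind it is the TraP dashboard.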
Running the project in a container
Starting from v1.3.0, a Docker container image is built for every release. The images are hosted here: https://git.astron.nl/RD/trap/container_registry. You can run trap-run inside the container on your local data. Use a bind mount to make local files available inside the container. This is needed to point to both your input images and any optional configuration file.
Example:
docker run --rm \
-v "$PWD":/"$PWD" \
git.astron.nl:5000/rd/trap:latest \
trap-run -i "$PWD/tests/data/lofar1" --config-file "$PWD/trap_config.toml" \
--db-name "$PWD/trap_sources.db"
Explanation:
--rm removes the container after it finishes.
-v "$PWD":/"$PWD" mounts your current directory into the container at the same path.
git.astron.nl:5000/rd/trap:latest points to the container image with the most recent version of TraP.
trap-run -i "$PWD/tests/data/lofar1" --config-file "$PWD/trap_config.toml" is the command executed inside the container. Adjust paths and arguments as needed for your data and configuration.
--db-name "$PWD/trap_sources.db" places the output SQLite database in the mounted directory.
It is good practice not to run Docker with sudo. To make that possible, add your user to the docker group, see: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user