.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "example_gallery/variability_metrics.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_example_gallery_variability_metrics.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_example_gallery_variability_metrics.py:


Variability metrics
===================

This is an example on filtering the sources in a TraP export database to find the most variable sources.
This exmple starts with a TraP export database based on approximately 60 LOFAR 1.0 images at a 2 minute interval.
The first three of these images are available next to the export database file in the repository under tests/data/lofar1.

Let's start by opening the test database.

.. GENERATED FROM PYTHON SOURCE LINES 12-18

.. code-block:: Python


    from trap.io import open_db

    db_path = "../tests/data/lofar1/GRB201006A_60_images.db"
    db_handle = open_db("sqlite", db_path)


.. GENERATED FROM PYTHON SOURCE LINES 20-34

There are two parameters we will use for describing the variability of a source:

 - The **reduced weighted** :math:`\chi ^2 (\eta)`: This is a fit to the light curve.
   The larger the value the less well it fits to a horizontal line and thus the more variable the source is.
 - The **coefficient of variation** (:math:`V`): This is the magnitude of the fuxe density variation in the lightcurve.
   The larger this value, the larger the variation in the flux densidty measurements and thus the more variable the source is.

The sources we are after in this example are those that score high on both of these metrics.

We can calculate these variability metrics by first constructing the lightcurves and then determining the variability within these
lightcurves. This is already done during runtime, where the variability metrics are stored in a table called 'variability'.
This uses all extracted sources, excluding the force fits. In case you want to recalculate these metrics or use only a subset of sources,
you can use the post-processing function :func:`trap.post_processing.construct_varmetric`.


.. GENERATED FROM PYTHON SOURCE LINES 35-40

.. code-block:: Python


    import pandas as pd

    varmetric_table = pd.read_sql_table("variability", db_handle)


.. GENERATED FROM PYTHON SOURCE LINES 41-43

Let's see what this gives us


.. GENERATED FROM PYTHON SOURCE LINES 44-46

.. code-block:: Python

    print(varmetric_table.head())


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       id  newsource  src_id  ...  nr_datapoints  first_image    first_detection_time
    0   0          0       0  ...             55            0 2020-10-06 01:22:37.500
    1   1          1       1  ...             54            0 2020-10-06 01:22:37.500
    2   2          2       2  ...             55            0 2020-10-06 01:22:37.500
    3   3          3       3  ...             55            0 2020-10-06 01:22:37.500
    4   4          4       4  ...             50            0 2020-10-06 01:22:37.500

    [5 rows x 15 columns]


.. GENERATED FROM PYTHON SOURCE LINES 47-51

From this table we will select only those sources (rows) where v_int i larger than zero.
Now that we have the variability metrics, we can create a plot showing the distribution of
the variablility values.


.. GENERATED FROM PYTHON SOURCE LINES 52-107

.. code-block:: Python


    varmetric_table = varmetric_table[varmetric_table.v_int > 0]

    import matplotlib.gridspec as gridspec
    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib.font_manager import FontProperties
    from matplotlib.ticker import NullFormatter


    def plot_variability():
        left = bottom = 0.1
        width = height = 0.65
        bottom_h = left_h = left + width + 0.02
        rect_scatter = [left, bottom, width, height]
        rect_histx = [left, bottom_h, width, 0.2]
        rect_histy = [left_h, bottom, 0.2, height]
        fig = plt.figure(1, figsize=(12, 11))
        ax_scatter = fig.add_subplot(223, position=rect_scatter)
        plt.xlabel(r"Recuced weighted $\chi^2$", fontsize=20)
        plt.ylabel("Coefficient of variation", fontsize=20)
        ax_histx = fig.add_subplot(221, position=rect_histx)
        ax_histy = fig.add_subplot(224, position=rect_histy)
        ax_histx.xaxis.set_major_formatter(NullFormatter())
        ax_histy.yaxis.set_major_formatter(NullFormatter())
        ax_histx.axes.yaxis.set_ticklabels([])
        ax_histy.axes.xaxis.set_ticklabels([])
        xdata_var = np.log10(varmetric_table["eta_int"])
        ydata_var = np.log10(varmetric_table["v_int"])
        ax_scatter.scatter(xdata_var, ydata_var, s=10)
        ax_histx.hist(xdata_var, bins=30, histtype="stepfilled", color="b")
        ax_histy.hist(
            ydata_var, bins=30, histtype="stepfilled", color="b", orientation="horizontal"
        )
        xmin = int(min(xdata_var) - 1.1)
        xmax = int(max(xdata_var) + 1.1)
        ymin = int(min(ydata_var) - 1.1)
        ymax = int(max(ydata_var) + 1.1)
        xvals = range(xmin, xmax)
        yvals = range(ymin, ymax)
        xtxts = [r"$10^{" + str(a) + "}$" for a in xvals]
        ytxts = [r"$10^{" + str(a) + "}$" for a in yvals]
        ax_scatter.set_xlim([xmin, xmax])
        ax_scatter.set_ylim([ymin, ymax])
        ax_scatter.set_xticks(xvals)
        ax_scatter.set_xticklabels(xtxts, fontsize=20)
        ax_scatter.set_yticks(yvals)
        ax_scatter.set_yticklabels(ytxts, fontsize=20)
        ax_histx.set_xlim(ax_scatter.get_xlim())
        ax_histy.set_ylim(ax_scatter.get_ylim())
        return ax_scatter, ax_histx, ax_histy


    ax_scatter, ax_histx, ax_histy = plot_variability()
    plt.show()


.. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_001.png
   :alt: variability metrics
   :srcset: /example_gallery/images/sphx_glr_variability_metrics_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 108-111

Since we want to select only the sources that score high on both variability metrics,
we can define a cutoff threshold and select only those sources that are above both thresholds.


.. GENERATED FROM PYTHON SOURCE LINES 112-154

.. code-block:: Python

    sigmaThresh = 1.3
    from scipy.stats import norm


    def SigmaFit(data):
        median = np.median(data)
        std_median = np.sqrt(np.mean([(i - median) ** 2.0 for i in data]))
        tmp_data = [
            a
            for a in data
            if a < 3.0 * std_median + median and a > median - 3.0 * std_median
        ]
        param1 = norm.fit(tmp_data)
        param2 = norm.fit(data)
        return param1, param2


    paramx, paramx2 = SigmaFit(np.log10(varmetric_table["eta_int"]))
    paramy, paramy2 = SigmaFit(np.log10(varmetric_table["v_int"]))
    print(
        "Gaussian Fit eta: "
        + str(round(10.0 ** paramx[0], 2))
        + "(+"
        + str(round((10.0 ** (paramx[0] + paramx[1]) - 10.0 ** paramx[0]), 2))
        + " "
        + str(round((10.0 ** (paramx[0] - paramx[1]) - 10.0 ** paramx[0]), 2))
        + ")"
    )
    print(
        "Gaussian Fit V: "
        + str(round(10.0 ** paramy[0], 2))
        + "(+"
        + str(round((10.0 ** (paramy[0] + paramy[1]) - 10.0 ** paramy[0]), 2))
        + " "
        + str(round((10.0 ** (paramy[0] - paramy[1]) - 10.0 ** paramy[0]), 2))
        + ")"
    )
    sigcutx = paramx[1] * sigmaThresh + paramx[0]
    sigcuty = paramy[1] * sigmaThresh + paramy[0]
    print("eta threshold = " + str(round(10.0**sigcutx, 2)))
    print("V threshold = " + str(round(10.0**sigcuty, 2)))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Gaussian Fit eta: 2.27(+4.3 -1.49)
    Gaussian Fit V: 0.17(+0.16 -0.08)
    eta threshold = 9.04
    V threshold = 0.4


.. GENERATED FROM PYTHON SOURCE LINES 155-157

Let's make the same plot again, but show the cutoff line we just calculated.


.. GENERATED FROM PYTHON SOURCE LINES 158-166

.. code-block:: Python


    ax_scatter, ax_histx, ax_histy = plot_variability()
    ax_histx.axvline(x=sigcutx, linewidth=2, color="k", linestyle="--")
    ax_histy.axhline(y=sigcuty, linewidth=2, color="k", linestyle="--")
    ax_scatter.axhline(y=sigcuty, linewidth=2, color="k", linestyle="--")
    ax_scatter.axvline(x=sigcutx, linewidth=2, color="k", linestyle="--")
    plt.show()


.. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_002.png
   :alt: variability metrics
   :srcset: /example_gallery/images/sphx_glr_variability_metrics_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 167-173

To see what the sources look like, let's select the sources from the varmetric_table that
are above both thresholds and plot the lightcurves next to the image. In the image we can
plot the location of the source. Since the source is fitted in every image the location
can vary a bit per image. We could plot the mean position, but it is also informative to
plot all locations the source was found to view the spread.


.. GENERATED FROM PYTHON SOURCE LINES 174-298

.. code-block:: Python


    import os
    from pathlib import Path

    import astropy
    from astropy.io import fits
    from astropy.wcs import WCS
    from matplotlib.patches import Ellipse
    from pandas import read_sql_query, read_sql_table

    from trap.post_processing import construct_lightcurves

    image_size = 150  # How many pixels to plot of the image. A larger size here means more zoomed out

    # Select the sources above both cutoffs
    variables = varmetric_table.loc[
        (varmetric_table["eta_int"] >= 10.0**sigcutx)
        & (varmetric_table["v_int"] >= 10.0**sigcuty)
    ]

    lightcurves = construct_lightcurves(db_handle, attribute="int_flux")
    lightcurves_err = construct_lightcurves(db_handle, attribute="int_flux_err")
    is_force_fit = construct_lightcurves(db_handle, attribute="is_force_fit")
    images = read_sql_table("images", db_handle)

    image_handle = fits.open(
        "../tests/data/lofar1/GRB201006A_final_2min_srcs-t0000-image-pb.fits",
        memmap=True,
    )[0]
    wmap = WCS(image_handle.header, naxis=2)

    # Compute percentiles for plotting on subset of data
    sample = image_handle.data[0, 0, ::20, ::20]
    vmin, vmax = np.nanpercentile(sample, [2, 99])

    nr_frequency_bands = len(lightcurves)
    for src_id in variables.src_id.unique():
        fig = plt.figure(figsize=(16, 4 * nr_frequency_bands))
        gs = gridspec.GridSpec(nr_frequency_bands, 2, width_ratios=[2, 1])

        axes = []

        for band_id in range(nr_frequency_bands):
            if band_id == 0:
                ax = fig.add_subplot(gs[band_id, 0])
            else:
                ax = fig.add_subplot(gs[band_id, 0], sharex=axes[0])
            axes.append(ax)

            acquisition_times = images[images.freq_band == band_id].taustart_ts.values

            ax.plot(
                acquisition_times,
                lightcurves[band_id].loc[src_id],
            )

            for im_id, im_name in enumerate(lightcurves[band_id].columns):
                ax.errorbar(
                    acquisition_times[im_id],
                    lightcurves[band_id].loc[src_id].to_numpy()[im_id],
                    yerr=lightcurves_err[band_id].loc[src_id].to_numpy()[im_id],
                    fmt="o",
                    markersize=3,
                    linestyle="-",
                    color=(
                        "r" if is_force_fit[band_id].loc[src_id].to_numpy()[im_id] else "b"
                    ),
                )

            ax.axhline(y=0.0, color="k", linestyle=":")
            ax.set_ylabel("Flux density (Jy)")
            ax.set_title(f"Frequency band {band_id}")

            if band_id != nr_frequency_bands - 1:
                ax.tick_params(labelbottom=False)
            else:
                ax.set_xlabel("Time")
                ax.tick_params(axis="x", labelrotation=90)

        # Right panel spans all rows
        ax2 = fig.add_subplot(gs[:, 1], projection=wmap)
        with db_handle.connect() as db_conn:
            source_query = (
                f"SELECT ra,dec,src_id FROM extracted_sources WHERE src_id == {src_id};"
            )
            sources = read_sql_query(source_query, db_conn)

            total_nr_sources = sources["src_id"].max()
            for i in range(0, int(total_nr_sources) + 1):
                im = ax2.scatter(
                    sources.ra,
                    sources.dec,
                    marker="o",
                    facecolors="none",
                    edgecolors="red",
                    linewidth=2,
                    transform=ax2.get_transform("fk5"),
                )
                median_ra = np.median(sources.ra)
                median_dec = np.median(sources.dec)
                px, py = wmap.wcs_world2pix(median_ra, median_dec, 1)
        half = image_size // 2
        x_min, x_max = int(px - half), int(px + half)
        y_min, y_max = int(py - half), int(py + half)

        ra_min, dec_min = wmap.wcs_pix2world(x_min, y_min, 0)
        ra_max, dec_max = wmap.wcs_pix2world(x_max, y_max, 0)
        cutout = image_handle.data[0, 0, y_min:y_max, x_min:x_max]
        ax2.imshow(
            cutout,
            cmap="gray_r",
            vmin=vmin,
            vmax=vmax,
            origin="lower",
            aspect="auto",
            extent=[ra_min, ra_max, dec_min, dec_max],
        )
        ax2.coords[0].set_format_unit(astropy.units.deg)
        ax2.coords[1].set_format_unit(astropy.units.deg)
        ax2.set_xlabel("Right Ascension (deg)")
        ax2.set_ylabel("Declination (deg)")
        ax2.set_title("Source location")
        plt.show()


.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_003.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_004.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_004.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_005.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_006.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_006.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_007.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_007.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_008.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_009.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_009.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_010.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_010.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_011.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_011.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /example_gallery/images/sphx_glr_variability_metrics_012.png
         :alt: Frequency band 0, Source location
         :srcset: /example_gallery/images/sphx_glr_variability_metrics_012.png
         :class: sphx-glr-multi-img


.. GENERATED FROM PYTHON SOURCE LINES 299-309

When observing these plots, several of them look like there is no source at the red markings.
This is not a fluke. In these cases there was a source but since we just show the first image,
the source was not present at that locaton in the first image. In most of these cases,
the source moved (maybe satellite?) and can sometimes still be seen in the image but at a different
location. This becomes more obvious when inspecting the zoomed in plot for multiple images,
but we cannot show that in this example because we only have access to a few of the input images here
to save space. I encourage you to run this example script on data you ran on your own machine and
see if you can identify the image the source was brightest at (using the im_id) and plot that image
instead of simply the first image in the batch like we do here.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 12.090 seconds)


.. _sphx_glr_download_example_gallery_variability_metrics.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: variability_metrics.ipynb <variability_metrics.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: variability_metrics.py <variability_metrics.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: variability_metrics.zip <variability_metrics.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_