tropy.analysis_tools¶
The tropy.analysis package is a collection of tools for dedicated analysis of scientific data. It has been especially used to explore atmospheric data, but the application to other types of data might be rather straight forward and meaningful.
tropy.analysis_tools.derived_variables¶
This module contains / will contain variables derived from some basic input fields. The name “basic” refers to the typical set of variables stored in climate or weather model output.
-
tropy.analysis_tools.derived_variables.
calc_lwp
(p, T, qv, qc, g=9.8, axis=-1)¶ Calculates the liquid water path.
Parameters: - p (numpy array) – atmospheric pressure [hPa]
- T (numpy array) – temperature [K]
- qv (numpy array) – water vapor mixing ratio [kg / kg]
- qc (numpy array) – liquid water mixing ratio [kg / kg]
Returns: lwp – liquid water path [g / m**2]
Return type: numpy array
Notes
- hydrostatic approximation is used to convert vertical integration from pressure to height levels
- impact of condesate mass on air density is ignored.
tropy.analysis_tools.grid_and_interpolation¶
-
tropy.analysis_tools.grid_and_interpolation.
COSMO_mean_vertical_levels
(Nlev)¶ Outputs the mean vertical levels of COSMO-DE taken from the model description of COSMO (see reference below).
Parameters: Nlev (int) – number of vertical levels, i.e.
Nlev = 51 for intermediate half-levels (e.g. for w) Nlev = 50 for main levels (e.g. for u)
Returns: lev – mean vertical COSMO-DE levels Return type: numpy array (1-dim) References
M. Baldauf, J. Foerstner, S. Klink, T. Reinhardt, C. Schraff, A. Seifert und K. Stephan “Kurze Beschreibung des Lokal-Modells Kuerzestfrist COSMO-DE (LMK) und seiner Datenbanken auf dem Datenserver des DWD”, Version 1.6, Stand: 31. 03. 2011
Tabelle 1, Seite 29
-
tropy.analysis_tools.grid_and_interpolation.
create_interpolation_index
(lon1, lat1, lon2, lat2, xy=False)¶ Given georeference of two grids in lon and lat, an index is built to map a field on grid 1 to grid 2.
Parameters: - lon1 (numpy array (2-dim)) – longitude of input grid 1
- lat1 (numpy array (2-dim)) – latitude of input grid 1
- lon2 (numpy array (2-dim)) – longitude of target grid 2
- lat2 (numpy array (2-dim)) – latitude of target grid 2
- xy (bool, optional, default = False) – switch if georef field are interpreted as local Cartesian coordinates: Then no internal transformation is used.
Returns: - ir (numpy array, 2-dim, dtype = int) – index field for transforming rows
- ic (numpy array, 2-dim, dtype = int) – index field for transforming columns
-
tropy.analysis_tools.grid_and_interpolation.
curve_flow_filter
(f, numberOfIterations=5)¶ Smoothing filter depending on isoline curvature. Interface for curvature flow filter from simpleITK toolkit.
Parameters: - f (numpy array (2-dim)) – 2d field to be filtered (smoothed)
- numberOfIterations (int, optional, default = 5) – number of iterations, increases smooting effect
Returns: f_sm – smoothed 2d field
Return type: numpy array (2-dim)
Notes
Only works if SimpleITK is installed !!!
-
tropy.analysis_tools.grid_and_interpolation.
cutout_cluster
(c, nc, nedge=30)¶ Makes a cutout of a categorial field c for an object of class / number nc.
Parameters: Returns: ccut – cutout of categorial field c
Return type: numpy array, 2-dim, dtype = int
-
tropy.analysis_tools.grid_and_interpolation.
cutout_field4box
(f, ind, bsize, **kwargs)¶ Cuts out a field based on center pix index and box size.
Parameters: Returns: fcut – resulting quadratic cutout
Return type: 2d or 3d numpy array
-
tropy.analysis_tools.grid_and_interpolation.
cutout_fields
(fin, slices, vaxis=0)¶ Cuts out a field or a list of similar fields given row and column slices.
Parameters: Returns: fout – resulting cutout (possibly a list of cutted fields)
Return type: 2d or 3d numpy array or list of fields
-
tropy.analysis_tools.grid_and_interpolation.
get_index
(p, lon, lat)¶ Given a point p the nearest grid point is returned.
Parameters: - p (numpy array) – point in the 2d grid
- lon (numpy array (2-dim)) – grid values of the 1st dimension (can be longitude, x, etc.)
- lat (numpy array (2-dim)) – grid values of the 2nd dimension (can be latitude, y, etc.)
Returns: - ir (numpy array (2-dim)) – row index (the 1st)
- ic (numpy array (2-dim)) – column index (the 2nd)
-
tropy.analysis_tools.grid_and_interpolation.
i2iset
(v, i)¶ Returns an index set for a n-dim numpy array given an index i which points to a position within the same but flattened array.
It is assumed that the array was flattened in C-mode.
Parameters: - v (numpy array (n-dim)) – field which determines the rank and shape
- i (int) – input index which points to the position in v.flatten()
Returns: iset – index set which points to the position in v
Return type: Notes
This routine helps you to locate a maximum in a n-dimensional array using e.g. the
np.argmax
function.
-
tropy.analysis_tools.grid_and_interpolation.
interpolate_field_to_hor_pos
(pos, lon, lat, field, **kwargs)¶ Given a sequence of points (or just one) a field (2d or 3d) given a grid is interpolated to the new positions.
Parameters: - pos (numpy array (2-dim)) – sequence of points given as np.array((x,y)) where x and y can be n-dimensional
- lon (numpy array (2-dim)) – grid values of the 1st dimension (can be longitude, x, etc.)
- lat (numpy array (2-dim)) – grid values of the 2nd dimension (can be latitude, y, etc.)
- field (numpy array) – 2d or 3d field to be interpolated to pos
- **kwargs (dict content) – keyword arguments for the griddata routine of the scipy.interpolate package can be passed through
Returns: f – field interpolated to the position sequence pos, f has the dimension n x Nlev, where n is the dimension of pos, Nlev the vertical dimension
Return type: numpy array
-
tropy.analysis_tools.grid_and_interpolation.
ldiff
(var, axis=-1)¶ Calculates difference of a field at adjacent levels. Now, wrapper for
numpy.diff
with default on last axis.Parameters: - var (numpy array (n-dim where n = [1,2,3])) – field; 1d, 2d or 3d array
- axis (int, optional, default = -1) – axis used for layer avering
Returns: dvar – level difference
Return type: numpy array (n-dim where n = [1,2,3])
-
tropy.analysis_tools.grid_and_interpolation.
ll2xy
(lon, lat, lon0=10.0, lat0=50.0, R=6380.0)¶ Transformation between longitude and latitude and local Cartesian coordinates in west-east direction (x) and south-north direction (y). The local projection is called “sinusoidal”.
Parameters: Returns: - x (numpy array) – Cartesian coordinate in west-east direction (increasing towards the East)
- y (numpy array) – Cartesian coordinate in south-north direction (increasing towards the North)
Notes
The unit of Earth’s radius determines the units of x and y. Default (km).
References
See https://en.wikipedia.org/wiki/Sinusoidal_projection for further info.
-
tropy.analysis_tools.grid_and_interpolation.
ll2xyc
(lon, lat, mlon=None, mlat=None, lon0=0, lat0=0.0)¶ Applies a centered version of ll2xy. The centering is around the mean lon/lat value and lon0, lat0 only define the x/y offset.
Parameters: - lon (numpy array) – longitude
- lat (numpy array) – latitude
- mlon (float, optional, default = None) –
longitude center point of projection
if
None
then average oflon
is used - mlat (float, optional, defaulft = None) –
latitutde center point of projection
if
None
then average oflat
is used - lon0 (float, optional, default = 0.) – zero longitude (for x-offset)
- lat0 (float, optional, default = 0.) – zero latitude (for y-offset)
Returns: - x (numpy array) – x-coordinate
- y (numpy array) – y-coordinate
-
tropy.analysis_tools.grid_and_interpolation.
lmean
(var, axis=-1)¶ Calculates the layer mean of a field.
Parameters: - var (numpy array (n-dim where n = [1,2,3])) – field; 1d, 2d or 3d array
- axis (int, optional, default = -1) – axis used for layer avering
Returns: varm – layer mean
Return type: numpy array (n-dim where n = [1,2,3])
-
tropy.analysis_tools.grid_and_interpolation.
low2hres_region
(region)¶ Based on low-res. region of MSG channels an correspoding HRV region is calculated.
Parameters: region (tuple of two tubles) – region slice for cuuting a field ((row1, row2), (col1, col2)) Returns: hrv_region Return type: corresponding HRV region as ((row1, row2), (col1, col2)) Notes
Calculation
It is assumed that the hrv region is a dependent variable which depends on the low res region attribute That means: Given the low res region cutout hrv region is determined as corresponding cutout which
- exactly fits with the low res region and
- refers to the artifical hrv full disk of 11136 x 11136
low res pixel with the index (I,J) = (0,0) has a high res pixel with index (i_m, j_m) = (2,2) in the middle and (i_u, j_u) = (1,1) in the upper corner see doc “MSG Level 1.5 Image Data Format Description”, Fig.8 which then leads to the relation between the low res pixel (I,J) and its corresponding upper corner high res pixel (i_u, j_u): (i_u, j_u) = 3 * (I, J ) + 1
-
tropy.analysis_tools.grid_and_interpolation.
make_add_edge
(var)¶ Adds egde region to 2d data field. Values in added left and right column, and lower and upper row are linearily extrapolated from their neighbors.
Parameters: var (numpy array (2-dim)) – 2d variable field (Nrow, Ncols) Returns: var_new – 2d variable field with added edge values (Nrows + 2, Ncols + 2) Return type: numpy array (2-dim)
-
tropy.analysis_tools.grid_and_interpolation.
make_hrv_upscaling
(v)¶ A 2d field is upscaled to 3-fold higher resolution using linear interpolation. Edge values are determined by linear extrapolation.
Parameters: v (numpy array (2-dim)) – 2d variable field (Nrow, Ncols) Returns: vhigh – variable field with 3-times higher resolution (3*Nrows, 3*Ncols) Return type: numpy array (2-dim)
-
tropy.analysis_tools.grid_and_interpolation.
make_index_set
(nrows, ncols)¶ This simple routine makes an 2d index set for given row and column number of a 2d field.
Parameters: Returns: - ii (numpy array, dtype = int) – 2d field of rows indices
- jj (numpy array, dtype = int) – 2d field of column indices
-
tropy.analysis_tools.grid_and_interpolation.
make_vert_cut
(p1, p2, lon, lat, vg, **kwargs)¶ Makes a horizontal - vertical cut through a 3d field.
Parameters: - p1 (list or numpy array) – 1st end of the cutting line (lon, lat)
- p2 (list or numpy array) – 2nd end of the cutting line (lon, lat)
- lon (numpy array (2-dim)) – longitude
- lat (numpy array (2-dim)) – latitude
- vg (numpy array (3-dim)) – 3d field to be interpolated
- **kwargs (dict content) – keyword arguments for the griddata routine of the scipy.interpolate package can be passed through
Returns: s (numpy array (1-dim)) – distance of km from one to the other end of the cutting line
lo (numpy array (1-dim)) – longitude along the cut
la (numpy array (1-dim)) – latitude along the cut
v (numpy array (2-dim)) – field interpolated to the cutting line
Vertical levels are kept.
-
tropy.analysis_tools.grid_and_interpolation.
mid2edge_gridvector
(x_mid, x_edge0)¶ Converts an mid-point based grid vector into an edge-based grid vector.
Parameters: - x_mid (numpy array (1-dim)) – mid-point-based grid vector (length N)
- x_edge0 (float) – left edge
Returns: x_edge – edge-based grid vector (length N + 1)
Return type: numpy array (1-dim)
-
tropy.analysis_tools.grid_and_interpolation.
region2slice
(lon, lat, region)¶ Given a region of ((lon1,lon2),(lat1, lat2)) the routine provides a slicing ((row1, row2), (col1, col2)) than accomodates the region.
Parameters: - lon (numpy array (2-dim)) – longitude
- lat (numpy array (2-dim)) – latitude
- region (tuple of two tubles) – ((lon1, lon2), (lat1, lat2))
Returns: slice – region slice for cuuting a field ((row1, row2), (col1, col2))
Return type: tuple of two tubles
-
tropy.analysis_tools.grid_and_interpolation.
remap_field
(lon, lat, f, dr=0.5)¶ Do nearest nearbor remapping on a simple equi-distant local cartesian coordiante grid.
Uses curvature flow filter for smoothing and pixel edge correction.
Parameters: - lon (numpy array, (2-dim)) – longitude
- lat (numpy array, (2-dim)) – latitude
- f (numpy array, (2-dim)) – 2d field to be interpolated
- dr (float, optional, default = 0.5) – grid spacing in km
Returns: - xnew (numpy array, (2-dim)) – regridded, equi-distant x-coordinate in km
- ynew (numpy array, (2-dim)) – regridded, equi-distant y-coordinate in km
- fint (numpy array, (2-dim)) – remapped field
-
tropy.analysis_tools.grid_and_interpolation.
simple_pixel_area
(lon, lat, xy=False, uncertainty=False)¶ Approximates the grid box area of a lon - lat grid.
The grid is treated as if it is an equidistant grid in local Cartesian coordinates on Earth’s surface (i.e. in x and y). The grid point is placed at each of the for corners of the grid box and the corresponding average area ist calculated.
Parameters: - lon (numpy array (2-dim)) – longitude
- lat (numpy array (2-dim)) – latitude
- xy (bool, optional, default = False) – switch if
lon
&lat
are interpreted as local Cartesian coordinates - uncertainty (bool, optional, default = False) –
switch if std of pixel area is output
Deviation are caused by effects of the not-rectangular grid.
Returns: - a (numpy array (2-dim)) – grid box area
- da (numpy array (2-dim), optional) – standard deviation of grid box area, if
uncertainty == True
-
tropy.analysis_tools.grid_and_interpolation.
spline_smooting
(x, y, s=0.1, fixed_endpoints=True)¶ Smoothes a curve with smooting spline.
Parameters: - x (numpy array (1-dim)) – abscissa values
- y (numpy array (1-dim)) – ordinate values
- s (float, optional, default = 0.1) – smooting parameter
- fixed_endpoints (nool, optional, default = True) – switch, if endpoinds should be hold fixed
Returns: y_smooth – smoothed version of y
Return type: numpy array (1-dim)
Notes
Uses the function
scipy.interpolate.UnivariateSpline
.
-
tropy.analysis_tools.grid_and_interpolation.
tube_cutout4box
(v3d, i0, boxsize)¶ Performs tube cutout with given index set. The center index might not be constant.
Parameters: Returns: vtube – 3d tube, quadratic cutout in the last two dimensions
Return type: numpy array (3-dim)
Notes
This function can be used top make vertical cutouts, but a transpose command has to be applied before and after.
-
tropy.analysis_tools.grid_and_interpolation.
xy2ll
(x, y, lon0=10.0, lat0=50.0, R=6380)¶ Transformation between local Cartesian coordinates in west-east direction (x) and south-north direction (y) and longitude and latitude. This assumes a sinusoidal projection. Inversion of the function
ll2xy
.Parameters: - x (numpy array) – Cartesian coordinate in west-east direction (increasing towards the East)
- y (numpy array) – Cartesian coordinate in south-north direction (increasing towards the North)
- lon0 (float, optional, default = 10.) – arbitrary initial longitude for x = 0
- lat0 (float, optional, default = 50.) – arbitrary initial latitude for y = 0
- R (float, optional, default = 6380.) – approximation for the radius of Earth
Returns: - lon (numpy array) – longitude
- lat (numpy array) – latitude
Notes
The unit of Earth’s radius determines the units of x and y. Default (km).
References
See https://en.wikipedia.org/wiki/Sinusoidal_projection for further info.
tropy.analysis_tools.optical_flow¶
-
tropy.analysis_tools.optical_flow.
Lagrangian_change
(f3d, tracer_field=None, gauss_sigma=1.0, flow_input=None, vmin=0, vmax=1.0)¶ Calculates Lagrangian change of a field using possibly another field to generate optical flow.
Parameters: - f3d (numpy array) – the field (time on 1st axis) for which Lagrangian change is calculated
- tracer_field (numpy array, optional, default = None) – the field (time on 1st axis) for which displacement is derived
if None: displacement is derievd from
f3d
- sigma_gauss (float, optional, default = 1.) – sigma for Gaussian smoothing for Lagrangian change fields
- flow_input (numpy array, optional, default = None) –
predefined flow field for u/v calculation
if None: flow is derived from either
f3d
ortracer_field
- vmin (float, optional, default = 0.) – lower bound of fields for rescaling needed for flow calculation
- vmax (float, optional, default = 1.) – upper bound of fields for rescaling needed for flow calculation
Returns: df – Lagrangian change of field
f3d
Return type: numpy array
Notes
Lagrangian changes are derived from displacements between subsequent time slots and hence, time dimension is ntimes - 1.
-
tropy.analysis_tools.optical_flow.
displacement_from_opt_flow
(f1, f2, method='farneback', **kwargs)¶ Derives displacement vector between to fields f1 and f2 using optical flow method (either farnebaeck, or tvl1).
Parameters: - f1 (numpy array, 2dim) – field for past time step
- f2 (numpy array, 2dim) – field for actual time step
- method (str, optional, default = 'farneback') – method selected for optical flow calculations possible options: ‘farneback’ and ‘tvl1’
- kwargs (dict) – parameters for opencv optical flow algorithm if not given, default is taken from _farnebaeck_default_parameters
Returns: flow – displacement vector as index shift 1st component is related to u, i.e. displacement along row, 2nd to v, i.e. along column
Return type: numpy array, 3dim
-
tropy.analysis_tools.optical_flow.
displacement_from_opt_flow_farneback
(f1, f2, vmin=None, vmax=None, **kwargs)¶ Derives displacement vector between to fields f1 and f2 using optical flow method after Farneback (2003).
Parameters: - f1 (numpy array, 2dim) – field for past time step
- f2 (numpy array, 2dim) – field for actual time step
- vmin (float, optional, default = None) – lower bound of fields for rescaling
- vmax (float, optional, default = None) – upper bound of fields for rescaling
- kwargs (dict) – parameters for opencv optical flow algorithm if not given, default is taken from _farnebaeck_default_parameters
Returns: flow – displacement vector as index shift 1st component is related to u, i.e. displacement along row, 2nd to v, i.e. along column
Return type: numpy array, 3dim
-
tropy.analysis_tools.optical_flow.
displacement_from_opt_flow_tvl1
(f1, f2, vmin=None, vmax=None, **kwargs)¶ Derives displacement vector between to fields f1 and f2 using optical flow method after Zach et al (2007).
Parameters: - f1 (numpy array, 2dim) – field for past time step
- f2 (numpy array, 2dim) – field for actual time step
- vmin (float, optional, default = None) – lower bound of fields for rescaling
- vmax (float, optional, default = None) – upper bound of fields for rescaling
- kwargs (dict) – parameters for opencv optical flow algorithm if not given, default is taken from _farnebaeck_default_parameters
Returns: flow – displacement vector as index shift 1st component is related to u, i.e. displacement along row, 2nd to v, i.e. along column
Return type: numpy array, 3dim
-
tropy.analysis_tools.optical_flow.
displacement_vector
(tracer_field, vmin=0, vmax=1.0)¶ Calculates displacement vector for several times.
Parameters: Returns: flow – displacement vector
Return type: numpy array
-
tropy.analysis_tools.optical_flow.
flow_velocity
(lon, lat, f3d, dt=5.0, flow_input=None, vmin=0, vmax=1.0)¶ Calculates flow field for Lagrangian displacements of tracers in the field.
Parameters: - lon (numpy array) – longitude
- lat (numpy array) – latitude
- f3d (numpy array) – 3d fields (time on 1st axis)
- dt (float, optional, default = 5.) – time interval in minutes
- flow_input (numpy array, optional, default = None) – predefined flow field for u/v calculation not use if None
- vmin (float, optional, default = 0.) – lower bound of fields for rescaling needed for flow calculation
- vmax (float, optional, default = 1.) – upper bound of fields for rescaling needed for flow calculation
Returns: - u (numpy array) – zonal velocity,
- v (numpy array) – meridional velocity
Notes
Velocities are derived from displacements between subsequent time slots and hence, time dimension is ntimes - 1.
-
tropy.analysis_tools.optical_flow.
morph_trans_opt_flow
(f, flow, method='forward')¶ Applies morphological transformation of field f given a displacement field.
Parameters: - f (numpy array) – 2d field that is transformed (source)
- flow (numpy array) – displacement vector between source and target stage
Returns: ftrans – transformed field
Return type: numpy array
tropy.analysis_tools.parallax¶
tropy.analysis_tools.segmentation¶
Package to perform segmentation of fields.
The special emphasis is on threshold-based segmentation of 2-dim atmoshperic fields for which object-based analysis techniques should be applied.
Notes
- the package has variants of connected compund analsis & watershedding
- some split-and-merge rules exists
- most of the method might be also applied to 3-dim fields
-
tropy.analysis_tools.segmentation.
clustering
(f, thresh, cluster_method='watershed', min_size=20, **kwargs)¶ General interface to aplly either connectivity or watershed clustering.
Parameters: - f (np.array) – field to be segmented
- thresh (float) – threshold for masking, set foreground to f > thresh
- cluster_method ({'watershed', 'watershed_merge', 'connect'}, optional) – switch between watershed or connectivity clustering
- min_size (int, optional, default = 20) – minimum cluster size (px) left in the cluster field
- **kwargs (dict, optional) – keyword argument for selected cluster algorithm
Returns: c – categorial cluster field
Return type: np.array
-
tropy.analysis_tools.segmentation.
combine_object_stack
(c3d)¶ Sequentially combines object label fields by searching for new objects that have no predecessor in the higher hierarchy level.
Parameters: c3d (numpy array, 3dim or 4dim) – labeled object data stack Returns: c – combined label field Return type: numpy array, 2dim or 3dim
-
tropy.analysis_tools.segmentation.
connected_sequences2pairlists
(connected_pairs_list)¶ The routine gets a set of paired numbers which show connection between numbers and collects all connections in a list of pairs.
Parameters: connected_pairs_list (list) – list of number pairs Returns: pair_list – dictionary that contains pair lists Return type: dict
-
tropy.analysis_tools.segmentation.
connectivity_clustering
(f, thresh, filter_method='curve', numberOfIterations=5, ctype=4, cluster_masking=True, siggauss=1.0, **kwargs)¶ Applies connectivity clustering to a field.
Parameters: - f (np.array) – field to be segmented
- thresh (float) – threshold for masking, set foreground to f > thresh
- filter_method ({'gauss', 'curve'}, optional) –
filter method used to smooth the input field
- ’gauss’ for 2d-Gaussian
- ’curve’ for curvature flow filter
- numberOfIterations (int, optional) – number of repeated application of curvature flow filter the larger then smoother if selected
- ctype ({4, 8}, optional) – set either 4- or 8-connectivity (edge vs. edges+corners)
- cluster_masking ({True, False}, optional) – if original (True) or smoothed (False) threshold mask is used.
- siggauss (float, optional) – sigma for Gaussian filter if selected
- **kwargs (dict, optional) – set of additional, but not used keywords
Returns: c – categorial cluster field
Return type: np.array
-
tropy.analysis_tools.segmentation.
markers_from_iterative_shrinking
(mask, marker_shrinkage_Niter=3, marker_shrinkage_dmin=0.1, **kwargs)¶ The routine tries to find core parts of objects via iterative shrinking.
Each object is normalized by its maximum distance-to-background which makes the algorithm scale-invariant.
Parameters: mask (np.array) – binary mask - marker_shrinkage_Niter : int, optional
- option for marker creation, how often shrinking is applied
- marker_shrinkage_dmin : float, optional
option for marker creation
How much of the edge is taken away. Distance is measured relative to the maximum distance, thus 0 < dmin < 1.
Returns: markers – markers field with increasing index per marker region Return type: np.array
-
tropy.analysis_tools.segmentation.
multithreshold_clustering
(f, thresh_min, thresh_max, nthresh=10, use_percentile_threshold=False, **kws)¶ Performs sequential segmentation and returns the combine result only keeping the predecessor object alive.
Parameters: - f (numpy array, 2dim or 3dim) – input field
- thresh_min (float) – minimum threshold value
- thresh_max (float) – maximum threshold value
- nthresh (int, optional) – number of threshold values
- use_percentile_threshold ({True, False}, optional) – if threshold list is derived from field percentiles
- **kws (dict) – keywords passed to segmentation routine
Returns: c – combined label field
Return type: numpy array, same shape as f
-
tropy.analysis_tools.segmentation.
percentiles_from_cluster
(f, c, p=[25, 50, 75], index=None)¶ Calculates percentiles of cells in an segmented (labeled) field.
Functionality missing in scipy.ndimage.measurements.
Parameters: - f (np.array) – the field as basis for percentile calculation
- c (np.array) – categorial cluster field
- p (list or np.array, optional) – percentiles array
- index (list or None, optional) – list of object labels for which percentiles are calculated
Returns: pc – percentiles per cell (including background (set to zero))
Return type: np.array
-
tropy.analysis_tools.segmentation.
remove_clustersize_outside
(c, min_size=6, max_size=inf)¶ Counts numbers of pixels per cluster cell and removes clusters outside a size range defined by a minimum and maximum size.
In addition, it sorts clusters by size in decreasing order.
Parameters: Returns: c_re – categorial cluster field with sorted objects (from large to small)
- objects smaller than or equal min_size are removed
- objects larger than or equal max_size are removed
Return type: np.array
-
tropy.analysis_tools.segmentation.
remove_clustersize_outside_slow
(c, min_size=6, max_size=inf, sort=True)¶ OLD VERSION –> much slower …
Counts numbers of pixels per cluster cell and removes clusters outside a size range defined by a minimum and maximum size.
In addition, it sorts clusters by size in decreasing order.
Parameters: Returns: c_re – categorial cluster field with sorted objects (from large to small)
- objects smaller than or equal min_size are removed
- objects larger than or equal max_size are removed
Return type: np.array
-
tropy.analysis_tools.segmentation.
remove_small_clusters
(c, min_size=6)¶ Counts numbers of pixels per cluster cell and removes clusters smaller than minimum size.
In addition, it sorts clusters by size in decreasing order.
Parameters: - c (np.array) – categorial cluster field
- min_size (int, optional) – minimum cluster size (px) left in the cluster field
Returns: c_re – categorial cluster field with sorted objects (from large to small)
objects smaller than or equal min_size are removed
Return type: np.array
-
tropy.analysis_tools.segmentation.
sequential_segmentation
(f, thresh_min, thresh_max, nthresh=10, use_percentile_threshold=False, **kws)¶ Performs sequential segmentation.
Parameters: - f (numpy array, 2dim or 3dim) – input field
- thresh_min (float) – minimum threshold value
- thresh_max (float) – maximum threshold value
- nthresh (int) – number of threshold values
- use_percentile_threshold ({True, False}, optional) – if threshold list is derived from field percentiles
- **kws (dict) – keywords passed to segmentation routine
Returns: c3d – stack of labeled field, from max. thresh to min. thresh
Return type: numpy array, one dimension added
-
tropy.analysis_tools.segmentation.
set_connectivity_footprint
(ctype, ndim)¶ Return connectivity footprint for either 4-connectivity or 8-connectivity.
4-connectivity only includes neigboring sides, 8-connectivity also includes neighboring diagonal values.
Parameters: - ctype ({4, 8}) – set either 4- or 8-connectivity (edge vs. edges+corners)
- ndim ({2,3}) – dimension of the field (either 2-dim or 3-dim)
Returns: footprint – ndim-dimensional connectivity footprint
Return type: np.array
-
tropy.analysis_tools.segmentation.
sort_clusters
(c)¶ It sorts clusters by size (number of pixels) in decreasing order.
Parameters: c (np.array) – categorial cluster field Returns: c_re – categorial cluster field with sorted objects (from large to small) Return type: np.array
-
tropy.analysis_tools.segmentation.
watershed_clustering
(f, thresh, dradius=3, ctype=4, marker_field='dist', filter_method='curve', numberOfIterations=5, cluster_masking=True, exclude_border=False, marker_method='mahotas_regmax', siggauss=1.0, marker_shrinkage_Niter=3, marker_shrinkage_dmin=0.1, **kwargs)¶ Applies watershed clustering to a field.
Parameters: - f (np.array) – field to be segmented
- thresh (float) – threshold for masking, set foreground to f > thresh
- dradius (int, optional) – minimal distance for neighboring maxima
- ctype ({4, 8}, optional) – set either 4- or 8-connectivity (edge vs. edges+corners)
- marker_field (str, optional) –
switch if input field f or a distance transform as used to set the watershed markers
if ‘dist’, then max. in Euclidean distance to background is used to set initial markers, else the max. in field f are taken
- filter_method ({'gauss', 'curve'}, optional) –
filter method used to smooth the input field
- ’gauss’ for 2d-Gaussian
- ’curve’ for curvature flow filter
- numberOfIterations (int, optional) – number of repeated application of curvature flow filter the larger then smoother if selected
- cluster_masking ({True, False}, optional) – if original (True) or smoothed (False) threshold mask is used.
- exclude_border ({True, False}, optional) – if clusters that touch domain borders are excluded (set to zero)
- marker_method (str, optional) –
defines a method for marker assignment
- ’brute_force_local_max’ : just uses scipy.ndimage.maximum_filter
- ’skimage_peak_local_max’ : uses skimage local maximum routine
- ’mahotas_regmax’ : uses mahotas regional maximum detection
- ’iterative_shrinking’ : uses iterative object shrinking depending on relative distance-to-background
- siggauss (float, optional) – sigma for Gaussian filter if selected
- marker_shrinkage_Niter (int, optional) – option for marker creation, how often shrinking is applied
- marker_shrinkage_dmin (float, optional) –
option for marker creation
How much of the edge is taken away. Distance is measured relative to the maximum distance, thus 0 < dmin < 1.
Returns: c – categorial cluster field
Return type: np.array
-
tropy.analysis_tools.segmentation.
watershed_merge_clustering
(f, thresh, min_size=20, merge_ratio=0.5, **kwargs)¶ Applies watershed clustering to a field and then tries to merge them again.
The merging is based on the ratio between maximum distances-to-background within the subclusters and on the cluster broders.
Parameters: - f (np.array) – field to be segmented
- thresh (float) – threshold for masking, set foreground to f > thresh
- min_size (int, optional, default = 20) – optional, minimum cluster size (px) left in the cluster field
- merge_ratio (float, optional) –
essentially the ratio between interface length and cluster size
if the interface is longer than merge_ratio * cluster_size than to two connected objects are merged.
- **kwargs (dict, optional) – keywords passed to watershed_clustering routine
Returns: c_update – categorial cluster field (updated for merges)
Return type: np.array
tropy.analysis_tools.statistics¶
Package for some statistical utility functions.
Special Highlights are:
- conditioned percentiles
- rank transformation / distribution mapping
- drawing from an empirical 1d / 2d distribution
-
tropy.analysis_tools.statistics.
KStest
(x, y, Nx_eff=None, Ny_eff=None)¶ Calculates Kolmogorov Smirnov test for the distributions of two samples in a standard way, but allows for input of effective degrees of freedom.
Nullhypothesis that the two samples share the same distribution is rejected if p-value is smaller than a threshold (typically 0.05 or 0.01).
Parameters: - x (numpy array) – sample of 1st variable
- y (numpy array) – sample of 2nd variable
- Nx_eff (int, optional, default = None) – number of effective degrees of freedom of variable x set to size of x if Nx_eff == None
- Ny_eff (int, optional, default = None) – number of effective degrees of freedom of variable y set to size of y if Ny_eff == None
Returns: - D (float) – KS test statistic
- pval (float) – p-value
-
tropy.analysis_tools.statistics.
autocorr
(x, Nmax=None)¶ Calculates autocorrelation function of masked array.
Parameters: - x (numpy array) – sample of 1st variable
- Nmax (int, optional, default = None) – maximum lag Nmax is set to len(x) - 1 if Nmax == None
Returns: c – autocorrelation function for lag 0 … Nmax,
Return type: numpy array
-
tropy.analysis_tools.statistics.
cond_perc_simple
(v1, v2, x1, p=[10, 25, 50, 75, 90], sig=0, medsize=0)¶ This is a simple routine to emulate the Rosenfeld T-Re plots.
Input are v1 (T) and v2 (Re) and percentiles of v2 conditioned on intervals of v1 given by x1 are calculated.
Parameters: - v1 (numpy array) – sample of 1st variable of bi-variate distribution
- v2 (numpy array) – sample of 2nd variable of bi-variate distribution
- x1 (numpy array) – intervals at which 1st variable is gathered
- p (list, optional, default = [10, 25, 50, 75, 90]) – percentiles of 2nd variables which will be calculated for each interval [ x1[i], x1[i + 1] )
- sig (float, optional, default = 0.) – sigma for Gaussian filter to smooth percentile curves (not applied for sig == 0)
- medsize (int, optional, default = 0) – footprint size of median filter to smooth percentile curves (not applied for sig != 0 OR medsize == 0)
Returns: v2p – percentile statistics of variable
v2
conditioned on differentx1
intervalsReturn type: numpy array
-
tropy.analysis_tools.statistics.
correlation_bootstrap
(x, y, perc=[2.5, 50, 97.5], xerr=0, yerr=0, Nsample=1000, corr=<function pearsonr>)¶ Calculates percentile values of possible correlation using a bootstrap approach.
(x,y) are interpreted as mean values and error bars as standard deviations of 2d Gaussian distribution per point and the fit is repeated for random draws out of these. After Nsampe fits, the range of possible fits is analyzed for their percentiles.
Parameters: - x (numpy array, 1-dim) – independent values
- y (numpy array, 1-dim) – dependent values
- perc (list or numpy array (1-dim), optional, default = [2.5, 50, 97.5]) – list of percentiles for which correlation is estimated
- xerr (float or numpy array (1-dim), optional, default = 0.) – standard deviation of x
- yerr (float or numpy array (1-dim), optional, default = 0.) – standard deviation of y
- Nsample (int, optional, default = 1000) – size of bottstrap sample
- corr (func, optional, default =
scipy.stats.pearsonr
) – function object used for correlation calculation
Returns: corr – percentiles of possible correlation obtained from bootstrapping
Return type: numpy array, 1-dim
-
tropy.analysis_tools.statistics.
crosscorr
(x, y, Nmax=None)¶ Calculates time-lagged crosscorrelation function of masked array.
Parameters: - x (numpy array) – 1st masked input array, evaluated at t + tau
- y (numpy array) – 2nd masked input array
- Nmax (int, optional, default = None) – maximum lag Nmax is set to len(x) - 1 if Nmax == None
Returns: c – autocorrelation function for lag 0 … Nmax,
Return type: numpy array
-
tropy.analysis_tools.statistics.
cumsum_data_fraction
(h, divide_by_total_sum=True)¶ The function uses iso-lines of equal density and maps the fraction of data enclosed by these lines onto it.
Parameters: - h (numpy array (n-dim)) – histogram / density or other variable for which cumsum makes sense
- divide_by_total_sum (bool, optional, default = True) – switch if cumsum field should be divided by total sum this generates a relative field with range (0, 1)
Returns: f – data fraction field
Return type: numpy array (n-dim)
-
tropy.analysis_tools.statistics.
draw_from_empirical_1ddist
(bins, hist, Nsamp=100)¶ Draw random number from an empirical distribution. Implemented for 1d histograms.
Parameters: - bins (numpy array) – bins edges (1dim)
- hist (numpy array (1-dim)) – histogram values (either absolute frequencies or relative)
- Nsamp (int, optional, default = 100) – number of random samples that are drawn
Returns: rvalues – random values that are distributed like hist
Return type: numpy array
-
tropy.analysis_tools.statistics.
draw_from_empirical_dist
(bins, hist, Nsamp=100, discrete_version=False)¶ Draw random number from an empirical distribution. Implemented for 2d histograms.
Parameters: - bins (list of two numpy arrays) – bins edges (2dim)
- hist (numpy array (2-dim)) – histogram values (either absolute frequencies or relative)
- Nsamp (int, optional, default = 100) – number of random samples that are drawn
- discrete_version (bool, optional, default = False) – if True: samples can only be placed at bin midpoints if False: samples are uniformly distributed within each bin
Returns: rvalues – random values (2d) that are distributed like hist
Return type: list of two numpy arrays
-
tropy.analysis_tools.statistics.
fdistrib_mapping
(fin, fmap, keep_range=False)¶ Performs a transformation of field fin to have the same distribution as fmap.
Parameters: - fin (numpy array (n-dim)) – input field
- fmap (numpy array (n-dim)) – field from which distribution is taken
- keep_range (bool, optional, default = False) – switch if output field is scaled to input range (with min-max transformation)
Returns: fout – transformed field fin
Return type: numpy array (n-dim)
-
tropy.analysis_tools.statistics.
get_outlier_from_residuals
(e, thresh=3.0)¶ Calculate outliers from residuals using percentile-based standardization.
Parameters: - e (numpy array) – input residuals
- thresh (float, optional, default = 3.) – value of residual (standardized) at which outliers are identified
Returns: m – outlier mask
Return type: numpy array
-
tropy.analysis_tools.statistics.
normalize_field
(f, vmin=None, vmax=None)¶ Normalization of a field is applied.
Parameters: Returns: f_norm – 2dim field
Return type: numpy array, 2dim
-
tropy.analysis_tools.statistics.
odrfit_with_outlier_removal
(func, x, y, xerr=0, yerr=0, n_iter=10, outlier_thresh=3.0, p0=None)¶ Calculates fitting parameter object using a orthogonal distance regression.
(x,y) are interpreted as mean values and error bars as standard deviations of 2d Gaussian distribution per point. The range of possible fits is analyzed for their percentiles.
Parameters: - func (function) – fitting function func(args, x), args: parameters of the function ! ARGS HAVE BEEN TURNED !
- x (numpy array, 1-dim) – independent values
- y (numpy array, 1-dim) – dependent values
- xerr (float or numpy array (1-dim), optional, default = 0.) – standard deviation of x
- yerr (float or numpy array (1-dim), optional, default = 0.) – standard deviation of y
- n_iter (int, optional, default = 10) – numbers of iteration used for outlier removal
- outlier_thresh (float, optional, default = 3.) – threshold for outlier removal (standardized fit residuals)
- p0 (list or tuple, optional, default = None) – first guess of fitted parameters (for increased convergence)
Returns: fit_result – result of the ODR fit
Return type: `` scipy.odr.ODR``
-
tropy.analysis_tools.statistics.
rank_transformation
(f, normalize=True, gamma=1.0)¶ The routine performs rank transformation of a field, meaning that the ranks of the individual field values are returned.
Parameters: Returns: f – field ranks
Return type: numpy array (n-dim)
tropy.analysis_tools.thermodynamic_variables¶
Package for the calculation of several thermodynamical properties of moist air (including condensate).
ToDo
- Some final tests of the routines should be done.
- There was an issue with mixing ratio vs. specific humidity, but might be resolved …
-
tropy.analysis_tools.thermodynamic_variables.
H2r
(hrel, p, T)¶ Converts relative humidity H into mixing ratio r.
Parameters: - hrel (numpy array) – relative humidity in [%]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: es – water vapor pressure in [Pa]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
absolute_humidity
(r, p, T)¶ Calculates absolute humidity.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: rho_v – absolute humidity / m**3]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
both_mass_fractions2mixing_ratios
(qv, qw)¶ Converts water and the vapor mass fraction to condensed water and vapor mixing ratio.
Parameters: - qv (numpy array) – water vapor mass fraction [kg / kg]
- qw (numpy array) – condensed water mass fraction [kg / kg]
Returns: - r (numpy array) – water vapor mixing ratio [kg / kg]
- rw (numpy array) – condensed water mixing ratio [kg / kg]
-
tropy.analysis_tools.thermodynamic_variables.
dew_point
(r, p, T)¶ Calculates dew point temperature after Markowski book p 13 eq (2.25)
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: Td – dew point temperature in [K]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
dry_air_density
(r, p, T)¶ Calculates dry air density.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: rho_d – dry air density in [kg / m**3]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
dry_air_potential_temperature
(r, p, T)¶ Calculates dry air potential temperature.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: theta_d – dry air potential temperature (K)
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
equivalent_potential_temperature
(r, p, T, eq=15)¶ Calculates equivalent potential temperature. after Bolton (1980) MWR 108, p.1046, eq. (43)
Davies-Jones (2009) MWR137, eq. (6.3)
also using either one of (15), (21) or (22)
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
- eq (int, optional, default = 15) – which eq. of Bolton (1980) to be chosen for lifting condensation level temperature eq in [15, 21, 22]
Returns: theta_E – equivalent potential temperature
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
lifting_condensation_level_temperature
(r, p, T, eq=15)¶ Calculates lifting condensation level temperature. after Bolton (1980) MWR 108, p.1046, eq. (15), (21) or (22)
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
- eq (int, optional, default = 15) – which eq. of Bolton (1980) to be chosen eq in [15, 21, 22]
Returns: TL – lifting condensation level temperature in [K]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
moist_gas_constant
(r)¶ Calculates gas constant of moist air.
Parameters: r (float or numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg] Returns: R_m – gas constant of moist air [J / kg / K] Return type: float or numpy array
-
tropy.analysis_tools.thermodynamic_variables.
moist_potential_temperature
(r, p, T)¶ Calculates moist air potential temperature.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: theta_m – moist air potential temperature (K)
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
moist_specific_heat
(r)¶ Calculates specific heat at constant pressure of moist air.
Parameters: r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg] Returns: c_pm – specific heat at constant pressure of moist air Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
relative_humidity
(r, p, T)¶ Calculates relative humidity.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: H – relative humidity in [%]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
saturation_over_ice
(T, method='PK')¶ Calculates saturation water vapor pressure over ice.
Parameters: - T (numpy array) – temperature [K]
- method (str, optional, default = 'PK') – use an equation which approximates the vapor pressure over ice method in [‘PK’, ‘Murphy’]
Returns: es – water vapor pressure in [Pa]
Return type: numpy array
Notes
‘PK’ refers to Pruppacher and Klett (1997) ‘Murphy’ refers to Murphy and Koop (2005) eq. (7)
References
Murphy, D. M., and T. Koop (2005), Review of the vapour pressures of ice and supercooled water for atmospheric applications, Quart. J. Roy. Meteor. Soc., 131(608), 1539-1565, doi:10.1256/qj.04.94.
-
tropy.analysis_tools.thermodynamic_variables.
saturation_pressure
(T)¶ Calculates saturation water vapor pressure after Bolton (1980) MWR 108, p.1046, eq. (10)
Parameters: T (numpy array) – temperature [K] Returns: es – water vapor pressure in [Pa] Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
specific_humidity
(r)¶ Calculates specific humidity.
Parameters: r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg] Returns: q – specific humidity i.e. vapor mass per moist air mass [kg / kg] Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
thermodynamic_constants
()¶ Sets some thermodynamic constants.
Parameters: None – Returns: const – set of thermodynamic constans saved in dictionary Return type: dict
-
tropy.analysis_tools.thermodynamic_variables.
total_density
(r, r_w, p, T)¶ Calculates total density of moist air with hydrometeors.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- r_w (numpy array) – mixing ratio of condensed water i.e. condensed water mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: rho_d – dry air density in [kg / m**3]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
water_vapor_pressure
(r, p, T)¶ Calculates water vapor pressure.
Parameters: - r (numpy array) – mixing ratio i.e. vapor mass per dry air mass in [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: e – water vapor pressure in [Pa]
Return type: numpy array
-
tropy.analysis_tools.thermodynamic_variables.
watermass_fraction2rw
(qw, r)¶ Converts water mass fraction to condensed water mixing ratio.
Parameters: - qw (numpy array) – condensed water mass fraction [kg / kg]
- r (numpy array) – mixing ration of water vapor [kg / kg]
- p (numpy array) – total gas pressure [Pa]
- T (numpy array) – temperature [K]
Returns: rw – condensed water mixing ratio [kg / kg]
Return type: numpy array