[postgis-users] Extracting variable information from netcdf, imported as raster to a table

Tobias Schmid schmid.tobi at gmail.com
Fri Nov 17 00:14:15 PST 2023


Sorry, the Screenshot was missing.

[image: image.png]

Am Fr., 17. Nov. 2023 um 09:02 Uhr schrieb Tobias Schmid <
schmid.tobi at gmail.com>:

> Hello Manswini,
> Hello Regina,
> I have read your e-mails with great interest. Many of the described
> problems seem familiar to me. I have been working with temporally and
> spatially resolved (weather) raster data for over 10 years.
>
> Here are a few comments and recommendations:
>
> 1. After some missteps I stopped using multiband-raster. Why?
>
>    - What happens if individual days (or hours) are missing?
>    - Can I always trust the order of the bands?
>
> 2. I work with 1-band grids. My database structure looks like this:
> [image: image.png]
>
> 3. The import is using raster2pgsql. Here is snippet:
>
>   def raster2pgsql(outGtiffPath, dbSchema, dbTable, dbHost=conf.dbHost,
>> dbPort=conf.dbPort, dbName=conf.dbName, dbUser=conf.dbUser):
>>     try:
>>         tableNotEmptySQL = f"SELECT EXISTS (SELECT * FROM
>> {dbSchema}.{dbTable} LIMIT 1);"
>>         if executeSQL(tableNotEmptySQL, fetchone=True):
>>             cmd = f"raster2pgsql -a -F -n filename -s 4326 {outGtiffPath}
>> {dbSchema}.{dbTable} | psql -h {dbHost} -p {dbPort} -d {dbName} -U {dbUser}"
>>         else:
>>             cmd = f"raster2pgsql -a -C -F -n filename -s 4326
>> {outGtiffPath} {dbSchema}.{dbTable} | psql -h {dbHost} -p {dbPort} -d
>> {dbName} -U {dbUser}"
>
>
>>         subprocess.call(cmd, shell=True)
>
>
>>         return True
>
>
>>     except Exception as e:
>>         print(f"Importing '{outGtiffPath}' to postgres db failed.")
>>         print(f"{e}\n")
>>         with open(conf.failedRaster2pgsql, 'a') as file:
>>             file.write(f"{datetime.now().strftime('%Y-%m-%d
>> %X')}\t{outGtiffPath}\t{e}\n")
>>         return False
>
>
> 4. I read the parameters "scale" and "offset" via Python and write them to
> the database.
>
>             hourlyRaster = gdal.Open(str(outGtiffPath))
>>             hourlyRasterScale = hourlyRaster.GetRasterBand(1).GetScale()
>>             hourlyRasterOffset = hourlyRaster.GetRasterBand(1).GetOffset()
>>             hourlyRaster = None
>
>
>>             insertTimestampSQL = f"SET TIMEZONE = 'UTC'; UPDATE
>> {dataset}.{dbTable} SET timestamp_w_tz = TO_TIMESTAMP('{timeStamp}',
>> 'YYYY-MM-DD HH24:MI'), rast_scale_factor = {hourlyRasterScale}, rast_offset
>> = {hourlyRasterOffset} WHERE filename = '{outGtiff}';"
>
>
> 5. The query:
>
> select timestamp_w_tz
>> , st_value(rast, st_setsrid(st_makepoint(15, 50), 4326)) * rast_scale_factor
>> + rast_offset as value_temperature
>> from era5.temperature_2m
>> where timestamp_w_tz >= '1982-09-18 00:00'
>> and timestamp_w_tz < '1982-09-19 00:00'
>>
>
> 6. The result (temperature in Kelvin. Of course.)
>
>> +---------------------------------+------------------+
>> |timestamp_w_tz                   |value_temperature |
>> +---------------------------------+------------------+
>> |1982-09-18 00:00:00.000000 +00:00|287.70160804659935|
>> |1982-09-18 01:00:00.000000 +00:00|287.47021164940014|
>> |1982-09-18 02:00:00.000000 +00:00|286.0640335433434 |
>> |1982-09-18 03:00:00.000000 +00:00|285.56675378590086|
>> |1982-09-18 04:00:00.000000 +00:00|285.41434365889944|
>> |1982-09-18 05:00:00.000000 +00:00|285.3075453217306 |
>> |1982-09-18 06:00:00.000000 +00:00|286.3343668343021 |
>> |1982-09-18 07:00:00.000000 +00:00|287.2543900097047 |
>> |1982-09-18 08:00:00.000000 +00:00|290.88219602540977|
>> |1982-09-18 09:00:00.000000 +00:00|292.3985099166719 |
>> |1982-09-18 10:00:00.000000 +00:00|297.1844104010519 |
>> |1982-09-18 11:00:00.000000 +00:00|297.37019500841853|
>> |1982-09-18 12:00:00.000000 +00:00|297.58267920007745|
>> |1982-09-18 13:00:00.000000 +00:00|297.85078752567847|
>> |1982-09-18 14:00:00.000000 +00:00|298.1689575718274 |
>> |1982-09-18 15:00:00.000000 +00:00|297.20666005462874|
>> |1982-09-18 16:00:00.000000 +00:00|296.9697012440353 |
>> |1982-09-18 17:00:00.000000 +00:00|294.35870439679223|
>> |1982-09-18 18:00:00.000000 +00:00|292.8045660944494 |
>> |1982-09-18 19:00:00.000000 +00:00|292.6032067295789 |
>> |1982-09-18 20:00:00.000000 +00:00|289.9332483003572 |
>> |1982-09-18 21:00:00.000000 +00:00|289.59839101402565|
>> |1982-09-18 22:00:00.000000 +00:00|287.79616907430096|
>> |1982-09-18 23:00:00.000000 +00:00|287.87181789646223|
>> +---------------------------------+------------------+
>
>
> I know that my example does not match your problem 100%. It is also
> time-consuming for you to establish the 1-band grid approach. It requires
> you to change your database structure. Sorry about that.
>
> Hopefully the comments will help you. You are also welcome to contact me
> personally.
>
> Best regards
> Tobias
>
> Am Fr., 17. Nov. 2023 um 05:32 Uhr schrieb Manaswini Ganjam via
> postgis-users <postgis-users at lists.osgeo.org>:
>
>> Regina,
>> Thank you for the suggestions, the code looks clean and the computation
>> was much better, but I still have the same issue, the values are
>> unrealistic, this happened to me before as well (when I tried to run
>> st_value for one raster band, with some x and y). Find the enclosed
>> screenshot.
>> [image: image.png]
>> I have test run the code for one prcp file and the prcp values are
>> negative, and the no data value for this dataset is -32768 (most of the
>> values are -32767 or -32750) and I have not preprocessed this data. Do we
>> need to manually unpack this data? If that is the case, how do I access the
>> scale_factor and add_offset attributes packed with netcdf?
>>
>> This information is from the climate data readme file:The data has been
>> packed into short integers (2 bytes instead of 4 byte reals) to save space.
>> You must unpack that data to get the correct floating point representation
>> of the data. Each netCDF variable that has been packed has an add_offset
>> and scale_factor attribute associated with it. Some software automatically
>> unpacks the data when it is read.
>> The formula to unpack the data is:
>> unpacked value = add_offset + ( (packed value) * scale_factor )
>>
>> Thank you,
>> Manaswini
>>
>>
>>
>>
>>
>>
>>
>> On Thu, 16 Nov 2023 at 22:02, Regina Obe <lr at pcorp.us> wrote:
>>
>>> One more optimization.  I think you can get rid of the FOREACH
>>> band_number too and reduce all that to this query
>>>
>>>
>>>
>>> So all this goes
>>>
>>>
>>>
>>>         band_numbers := ARRAY(SELECT generate_series(1, 366));
>>>
>>>         -- Loop through all bands within the file
>>>         FOREACH band_number IN ARRAY band_numbers LOOP
>>>
>>> :
>>>
>>> LOOP
>>>
>>>
>>>
>>> Use ST_NumBands instead of relying on your rasters all having 366 bands
>>> https://postgis.net/docs/RT_ST_NumBands.html
>>>
>>> So replace all the above with this:
>>>
>>>
>>>
>>>             INSERT INTO extracted_data_bbox (year, year_day, lat, lon,
>>> prcp, tmax, tmin, observation_time)
>>>             SELECT year_from_filename, n.band_number, ST_Y(sc.geom),
>>> ST_X(sc.geom), CASE WHEN variable_name = 'prcp' THEN sc.val ELSE NULL END,
>>>                     CASE WHEN variable_name = 'tmax' THEN sc.val ELSE
>>> NULL END,
>>>                     CASE WHEN variable_name = 'tmin' THEN sc.val ELSE
>>> NULL END, observation_time
>>>
>>>            FROM  generate_series(1,
>>> ST_NumBands(raster_record.clipped_raster) ) AS n(band_number),
>>> ST_PixelAsCentroids(raster_record.clipped_raster, n.band_number, true) AS
>>> sc;
>>>
>>>
>>>
>>> That will solve the issue of you going over the band count of your
>>> raster, which might be an issue you are running into
>>>
>>>
>>>
>>>
>>>
>>> *From:* Regina Obe <lr at pcorp.us>
>>> *Sent:* Thursday, November 16, 2023 9:46 PM
>>> *To:* 'Manaswini Ganjam' <manu.ganjam at gmail.com>
>>> *Cc:* 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org>
>>> *Subject:* RE: [postgis-users] Extracting variable information from
>>> netcdf, imported as raster to a table
>>>
>>>
>>>
>>> I don’t know much about netCDF but I would assume GDAL would handle the
>>> packing unraveling behind the scenes.
>>>
>>>
>>>
>>> Some things I notice wrong
>>>
>>>
>>>
>>> https://postgis.net/docs/RT_ST_PixelAsCentroids.html  is a set
>>> returning function, so you can’t just stuff it in a record variable.  It
>>> has to go in a table.
>>>
>>> So I’d get rid of this line
>>>
>>>
>>>
>>>           SELECT * FROM
>>> ST_PixelAsCentroids(raster_record.clipped_raster, band_number, true) INTO
>>> pixel_data;
>>>
>>> I’d also scrap this, because ultimately you want to work on all
>>> centroids not the first one that falls out of the tree
>>>
>>> EXECUTE format('SELECT ST_Value($1, $2, $3, true)',
>>> raster_record.clipped_raster, band_number, centroid_point) INTO
>>> variable_value USING raster_record.clipped_raster, band_number,
>>> centroid_point;
>>>
>>>
>>>
>>> I’d change this
>>>
>>>
>>>             -- Insert the extracted data into the extracted_data_bbox
>>> table
>>>             INSERT INTO extracted_data_bbox (year, year_day, lat, lon,
>>> prcp, tmax, tmin, observation_time)
>>>             VALUES (year_from_filename, band_number,
>>> ST_Y(centroid_point), ST_X(centroid_point), CASE WHEN variable_name =
>>> 'prcp' THEN variable_value ELSE NULL END,
>>>                     CASE WHEN variable_name = 'tmax' THEN variable_value
>>> ELSE NULL END,
>>>                     CASE WHEN variable_name = 'tmin' THEN variable_value
>>> ELSE NULL END, observation_time);
>>>
>>>
>>>
>>> To this:
>>>
>>>
>>>
>>>
>>>
>>>
>>> *            INSERT INTO extracted_data_bbox (year, year_day, lat, lon,
>>> prcp, tmax, tmin, observation_time)            SELECT year_from_filename,
>>> band_number, ST_Y(sc.geom), ST_X(sc.geom), CASE WHEN variable_name = 'prcp'
>>> THEN sc.val ELSE NULL END,                    CASE WHEN variable_name =
>>> 'tmax' THEN sc.val ELSE NULL END,                    CASE WHEN
>>> variable_name = 'tmin' THEN sc.val ELSE NULL END, observation_time*
>>>
>>> *           FROM  ST_PixelAsCentroids(raster_record.clipped_raster,
>>> band_number, true) AS sc;*
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Manaswini Ganjam <manu.ganjam at gmail.com>
>>> *Sent:* Thursday, November 16, 2023 1:19 PM
>>> *To:* Regina Obe <lr at pcorp.us>
>>> *Cc:* PostGIS Users Discussion <postgis-users at lists.osgeo.org>
>>> *Subject:* Re: [postgis-users] Extracting variable information from
>>> netcdf, imported as raster to a table
>>>
>>>
>>>
>>> Thank you, Regina,
>>>
>>>
>>>
>>> I apologize, I have shared the older version of my code, I have tried
>>> ST_pixelascentroid and ST_value and the issue was I messed up the
>>> parameters of centroids and that caused the errors now I corrected that as
>>> well. I think the code below is a good representation of my approach, I
>>> would like to iterate for every raster in a table, for every band and for
>>> every lat and lon and extract st value, and insert it to a table,
>>>
>>>
>>>
>>> psql:/home/manaswini/MEGA/bbgc_uw_rpackage/rproject/raster2pgsql/extract_table_queryyyy.sql:66:
>>> NOTICE:  Invalid band index (must use 1-based). Returning empty set for all
>>> the coordinates and bands. What could be the issue.
>>>
>>>
>>>
>>> Is this because my data is packed? the readme file of the data has this
>>> information: Details about the data packing:
>>>
>>> The data has been packed into short integers (2 bytes instead of 4 byte reals) to save space. You must unpack that data to get the correct floating point representation of the data. Each netCDF variable that has been packed has an add_offset and scale_factor attribute associated with it. Some software automatically unpacks the data when it is read.
>>>
>>>
>>>
>>> The formula to unpack the data is:
>>>
>>>
>>>
>>> unpacked value = add_offset + ( (packed value) * scale_factor )
>>>
>>>
>>>
>>> For more information see here:
>>>
>>> https://www.unidata.ucar.edu/software/netcdf/workshops/2010/bestpractices/Packing.html
>>>
>>>
>>>
>>> There's also another attribute called "missing_value". In this case all the -32768 values you see are missing. Only the grid points outside the downscaling domain is given the missing data value.
>>>
>>>
>>>
>>> The packing saves a lot of space, that is why the data is packed.
>>>
>>>
>>>
>>>
>>>
>>> -- Create the precipitation_temperature_data table
>>> DROP TABLE IF EXISTS extracted_data_bbox;
>>> CREATE TABLE extracted_data_bbox (
>>>     id SERIAL PRIMARY KEY,
>>>     year integer,
>>>     year_day integer,
>>>     lat double precision,
>>>     lon double precision,
>>>     prcp double precision,
>>>     tmax double precision,
>>>     tmin double precision,
>>>     observation_time timestamp
>>> );
>>>
>>> -- Loop through all records in gfdl_03_bbox
>>> DO $$
>>> DECLARE
>>>     raster_record RECORD;
>>>     year_from_filename integer;
>>>     observation_time timestamp;
>>>     centroid_point geometry;
>>>     variable_name text;
>>>     variable_value double precision;
>>>     band_number integer;
>>>     band_numbers integer[];
>>>     pixel_data RECORD;
>>> BEGIN
>>>     FOR raster_record IN (SELECT * FROM gfdl_03_bbox) LOOP
>>>         SELECT regexp_replace(raster_record.filename,
>>> '.*_(\d{4})[^0-9]+', '\1')::integer INTO year_from_filename;
>>>
>>>         -- Determine the variable name based on the filename
>>>         IF raster_record.filename ~ 'prcp' THEN
>>>             variable_name := 'prcp';
>>>         ELSIF raster_record.filename ~ 'tmax' THEN
>>>             variable_name := 'tmax';
>>>         ELSIF raster_record.filename ~ 'tmin' THEN
>>>             variable_name := 'tmin';
>>>         ELSE
>>>             RAISE EXCEPTION 'Unknown variable in filename: %',
>>> raster_record.filename;
>>>         END IF;
>>>
>>>         band_numbers := ARRAY(SELECT generate_series(1, 366));
>>>
>>>         -- Loop through all bands within the file
>>>         FOREACH band_number IN ARRAY band_numbers LOOP
>>>             -- Print the band number to the PostgreSQL log
>>>             RAISE NOTICE 'Band Number: %', band_number;
>>>
>>>             observation_time := MAKE_DATE(year_from_filename, 1, 1) +
>>> (band_number - 1) * interval '1 day';
>>>
>>>             SELECT * FROM
>>> ST_PixelAsCentroids(raster_record.clipped_raster, band_number, true) INTO
>>> pixel_data;
>>>
>>>             -- Extract the centroid point from pixel_data
>>>             centroid_point := pixel_data.geom;
>>>
>>>             -- Extract the variable value based on the variable name
>>>             EXECUTE format('SELECT ST_Value($1, $2, $3, true)',
>>> raster_record.clipped_raster, band_number, centroid_point) INTO
>>> variable_value USING raster_record.clipped_raster, band_number,
>>> centroid_point;
>>>
>>>             -- Insert the extracted data into the extracted_data_bbox
>>> table
>>>             INSERT INTO extracted_data_bbox (year, year_day, lat, lon,
>>> prcp, tmax, tmin, observation_time)
>>>             VALUES (year_from_filename, band_number,
>>> ST_Y(centroid_point), ST_X(centroid_point), CASE WHEN variable_name =
>>> 'prcp' THEN variable_value ELSE NULL END,
>>>                     CASE WHEN variable_name = 'tmax' THEN variable_value
>>> ELSE NULL END,
>>>                     CASE WHEN variable_name = 'tmin' THEN variable_value
>>> ELSE NULL END, observation_time);
>>>         END LOOP;
>>>     END LOOP;
>>> END;
>>> $$;
>>>
>>> I have tried the following query to check values for a raster band : the
>>> output is again null
>>> psql:/home/manaswini/MEGA/bbgc_uw_rpackage/rproject/raster2pgsql/test_query_fff.sql:22:
>>> NOTICE:  Band: 20, X: 1, Y: 135, Value: <NULL>
>>>
>>> :
>>>
>>> DO $$
>>> DECLARE
>>>     band_number integer := 20; -- Replace with the desired band number
>>>     x_coord integer;
>>>     y_coord integer;
>>>     pixel_value double precision;
>>>     srid integer := 4326; -- Replace with the correct SRID for your data
>>> BEGIN
>>>     -- Loop through all x and y coordinates in the raster band
>>>     FOR x_coord IN (SELECT generate_series(1,
>>> ST_Width(rast.clipped_raster)) FROM gfdl_03_bbox AS rast WHERE
>>> rast.filename = 'prcp_03_2034.nc') LOOP
>>>         FOR y_coord IN (SELECT generate_series(1,
>>> ST_Height(rast.clipped_raster)) FROM gfdl_03_bbox AS rast WHERE
>>> rast.filename = 'prcp_03_2034.nc') LOOP
>>>             -- Get the pixel value at the current x and y coordinates
>>>             SELECT ST_Value(ST_SetSRID(rast.clipped_raster, srid),
>>> band_number, ST_SetSRID(ST_Point(x_coord, y_coord), srid)) INTO pixel_value
>>>             FROM gfdl_03_bbox AS rast
>>>             WHERE rast.filename = 'prcp_03_2034.nc';
>>>
>>>             -- Print or use the pixel value as needed
>>>             RAISE NOTICE 'Band: %, X: %, Y: %, Value: %', band_number,
>>> x_coord, y_coord, pixel_value;
>>>         END LOOP;
>>>     END LOOP;
>>> END;
>>>
>>> Thank you,
>>>
>>> Manaswini
>>>
>>>
>>>
>>>
>>>
>>> On Wed, 15 Nov 2023 at 20:42, Regina Obe <lr at pcorp.us> wrote:
>>>
>>> I didn’t realize that netCDF also has a vector component.
>>>
>>>
>>>
>>> https://gdal.org/drivers/vector/netcdf.html#vector-netcdf
>>>
>>>
>>>
>>> To read the vector component, you’d use the ogr_fdw extension to read
>>> that rather than postgis_raster extension.
>>>
>>>
>>>
>>> Details here -  https://github.com/pramsey/pgsql-ogr-fdw
>>>
>>>
>>>
>>> So perhaps that is what your python is doing reading the vector
>>> component.  I don’t know if netcdf mixes those in one data set or not since
>>> I have no experience using netcdf.
>>>
>>>
>>>
>>> I recall you said you are using the OSGeo Live 16 distribution.  I just
>>> checked the OSGeoLive 16 does include ogr_fdw
>>>
>>>
>>>
>>> So do
>>>
>>>
>>>
>>> CREATE EXTENSION ogr_fdw;
>>>
>>>
>>>
>>>
>>>
>>> The list of supported formats you can see with this query:
>>>
>>>
>>>
>>> SELECT name FROM unnest(ogr_fdw_drivers()) AS f(name) ORDER BY name;
>>>
>>>
>>>
>>> Which for osgeolive 16, I see netCDF listed
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* Regina Obe <lr at pcorp.us>
>>> *Sent:* Wednesday, November 15, 2023 6:19 PM
>>> *To:* 'PostGIS Users Discussion' <postgis-users at lists.osgeo.org>
>>> *Cc:* 'Manaswini Ganjam' <manu.ganjam at gmail.com>
>>> *Subject:* RE: [postgis-users] Extracting variable information from
>>> netcdf, imported as raster to a table
>>>
>>>
>>>
>>> Just confirming some stuff, since I’m not completely following:
>>>
>>>
>>>
>>> Raster_record.rast column is of type raster correct?  IF so ST_X and
>>> ST_Y won’t work since those are for geometry types.
>>>
>>>
>>>
>>> Also ST_Value(raster_record.rast, band_number), won’t work either since
>>> that expects as input a geometry or x,y on the raster you want the value
>>> you.
>>>
>>> I would think you would have gotten an error with that, which makes me
>>> feel I’m missing something critical.
>>>
>>>
>>>
>>> If you want to extract all the pixels in a raster, you’d do something
>>> like https://postgis.net/docs/RT_ST_PixelAsPoints.html
>>>
>>>
>>>
>>> SELECT pp.x, pp.y, pp.val, ST_X(pp.geom) AS lon, ST_Y(pp.geom) AS lat
>>>
>>> FROM raster_record,
>>>
>>> ST_PixelAsPoints(raster_record.rast, 1) AS pp
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *From:* postgis-users <postgis-users-bounces at lists.osgeo.org> *On
>>> Behalf Of *Manaswini Ganjam via postgis-users
>>> *Sent:* Wednesday, November 15, 2023 2:01 PM
>>> *To:* postgis-users at lists.osgeo.org
>>> *Cc:* Manaswini Ganjam <manu.ganjam at gmail.com>
>>> *Subject:* [postgis-users] Extracting variable information from netcdf,
>>> imported as raster to a table
>>>
>>>
>>>
>>> Hi,
>>>
>>> I have been trying to download s3 cloud stored gridded climate data and
>>> generate tables with variables, lat, lon and timestamp (year, yearday). To
>>> achieve this I used raster2pgsql and imported multiple netcdf files into a
>>> database table.
>>>
>>>
>>>
>>> Question: How to achieve the extraction of variables using postgis? I
>>> tried using ST_value, ST_pixelaspoints but I was getting errors, mainly due
>>> to the format in which netcdfs are stored in the database (the error says
>>> can't load some characters like 00E30100082...), I even tried changing
>>> the datatype to float but still did not work. I mean it is probably not
>>> simple like selecting a variable from the netcdf. I have enclosed my sql
>>> query below:
>>>
>>>
>>>
>>>   -- Iterate through all raster files in the table
>>>     FOR raster_record IN (SELECT * FROM gfdl_03_prcp) LOOP
>>>         -- Determine the year from the raster file name, assuming the
>>> format is 'prcp_03_1950.nc'
>>>         SELECT
>>>             regexp_replace(raster_record.filename, '.*_(\d{4})\.nc',
>>> '\1')::integer
>>>         INTO
>>>             year;
>>>
>>>         -- Calculate the start date of the year
>>>         year_start := (year || '-01-01')::date;
>>>
>>>         -- Determine if the year is a leap year
>>>         is_leap_year := EXTRACT(ISODOW FROM (year_start + interval '1
>>> year')) = 7;
>>>
>>>         -- Set the number of bands for the year (365 for non-leap years,
>>> 366 for leap years)
>>>         FOR band_number IN 1..(CASE WHEN is_leap_year THEN 366 ELSE 365
>>> END) LOOP
>>>             -- Calculate the observation_time using the year and band
>>> number
>>>             observation_time := year_start + (band_number - 1) *
>>> interval '1 day';
>>>
>>>             -- Extract X (lon) and Y (lat) coordinates from the raster
>>>             SELECT
>>>                 ST_X(raster_record.rast) AS lon,
>>>                 ST_Y(raster_record.rast) AS lat
>>>             INTO
>>>                 lon,
>>>                 lat;
>>>
>>>             -- Insert the lat, lon, prcp, and observation_time into the
>>> extracted_values table
>>>             INSERT INTO extracted_values (lat, lon, prcp,
>>> observation_time)
>>>             VALUES
>>>                 (lat, lon, ST_Value(raster_record.rast, band_number),
>>> observation_time);
>>>
>>>             -- Increment the counter
>>>             counter := counter + 1;
>>>
>>>             -- Commit the transaction periodically in batches
>>>             IF counter % batch_size = 0 THEN
>>>                 COMMIT;
>>>             END IF;
>>>         END LOOP;
>>>     END LOOP;
>>>
>>>
>>>
>>>
>>> The metadata for the two files is as follows:
>>>
>>>
>>>
>>> File from database:
>>>
>>> {'NC_GLOBAL#Conventions': 'CF-1.5',
>>>
>>>  'NC_GLOBAL#GDAL': 'GDAL 3.6.4, released 2023/04/17',
>>>
>>>  'NC_GLOBAL#history': 'Wed Nov 15 13:32:13 2023: GDAL CreateCopy( not_clipped_prcp.nc, ... )'}
>>>
>>> File before loading into the database:
>>>
>>> {'lat#units': 'degrees_north',
>>>
>>>  'lon#units': 'degrees_east',
>>>
>>>  'NC_GLOBAL#title': 'Daily statistically downscaled CMIP5 data for the United States and southern Canada east of the Rocky Mountains, version 1.0, realization 1, 0.1x0.1 degree spatial resolution.',
>>>
>>>  'NETCDF_DIM_EXTRA': '{time}',
>>>
>>>  'NETCDF_DIM_time_DEF': '{366,4}',
>>>
>>>  'NETCDF_DIM_time_VALUES': '{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14....362,363,364,365}',
>>>
>>>  'prcp#add_offset': '819.17499',
>>>
>>>  'prcp#long_name': 'daily precipitation accumulation',
>>>
>>>  'prcp#missing_value': '-32768',
>>>
>>>  'prcp#scale_factor': '0.025',
>>>
>>>  'prcp#units': 'mm',
>>>
>>>  'prcp#_FillValue': '-32768',
>>>
>>>  'time#units': 'days since 1952-1-1 0:0:0.0'}
>>>
>>>
>>>
>>> In case this information is useful: Previously I used python to extract variable information and generate a csv or table using this variable information, and the code is enclosed below. In the code I extracted variable values using lon = dataset.variables['lon'][:] and iterated for loops to write them all in csv.
>>>
>>> Python code:
>>>
>>> import netCDF4 as nc
>>>
>>> # Step 1: Read the NetCDF file
>>> filename = "/home/manaswini/prcp_03_1950.nc"
>>> dataset = nc.Dataset(filename)
>>> dataset.set_auto_mask(False)
>>> dataset.set_auto_scale(True)
>>>
>>>
>>>
>>> lon = dataset.variables['lon'][:]
>>> lat = dataset.variables['lat'][:]
>>> time = dataset.variables['time'][:]
>>> prcp = dataset.variables['prcp'][:]
>>>
>>>
>>>
>>> import numpy as np
>>> import csv
>>>
>>> # csv_buffer
>>> csv_buffer = open('output.csv', 'w', newline='')
>>> csv_writer = csv.writer(csv_buffer)
>>>
>>> # Iterate through grid points and write to CSV buffer
>>> for i in enumerate(lon):
>>>     for j in enumerate(lat):
>>>         for k in enumerate(time):
>>>          csv_writer.writerow([lat[j], lon[i], prcp[i][j][k], year[k],
>>> yearday[k]])
>>>
>>>
>>> # Close the CSV buffer
>>> csv_buffer.close()
>>>
>>>
>>>
>>> Thank you,
>>>
>>> Manaswini Ganjam
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Manaswini Ganjam
>>>
>>
>>
>> --
>> Manaswini Ganjam
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/postgis-users
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231117/113ce563/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 232418 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231117/113ce563/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 7895 bytes
Desc: not available
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20231117/113ce563/attachment-0001.png>


More information about the postgis-users mailing list