[postgis-devel] [SoC] GSoC 2021 - Final Report: Implement pre-sorting methods before GiST index building

Han Wang hanwgeek at gmail.com
Mon Aug 23 06:58:08 PDT 2021


Hi Regina,

Thank you and Giuseppe for the help! It is really nice to share ideas and
communicate with the open source community! Hope that there will be more
opportunities for cooperation in the future.

Thanks,
Han

On Mon, Aug 23, 2021 at 2:29 PM Regina Obe <lr at pcorp.us> wrote:

> Great work Han. It was great having you in the meeting and I hope you like
> PostGIS enough to consider contributing in the future. J
>
> You’d be a great asset to our team.
>
>
>
> Thanks,
>
> Regina
>
>
>
> *From:* postgis-devel [mailto:postgis-devel-bounces at lists.osgeo.org] *On
> Behalf Of *Han Wang
> *Sent:* Monday, August 23, 2021 12:55 AM
> *To:* PostGIS Development Discussion <postgis-devel at lists.osgeo.org>;
> soc at lists.osgeo.org
> *Subject:* [postgis-devel] [SoC] GSoC 2021 - Final Report: Implement
> pre-sorting methods before GiST index building
>
>
>
> Hi all,
>
>
>
> Here is my final report about my GSoC project: Implement pre-sorting
> methods before GiST index building.
>
>
>
> *Abstract*
>
> GiST(Generalized Search Tree) is a generalization data structure of a
> variety of disk-based height-balanced search trees. Under the high-level
> API of GiST, structures like b-tree, r-tree can be implemented for data
> management. PostgreSQL defines a set of process function APIs for elements
> of the GiST index. Only with these function implementations can a data type
> be indexed and managed by a GiST structure. In large data scenarios,
> pre-sorting a batch of data fetched in memory may be a local approximation
> to the global sorting method. Recent PostgreSQL patch shows that it should
> speed up the build of a GiST index after some pre-sorting of the data which
> needs to be indexed. In one fork, the author replaces the GIST_OPTIONS_PROC
> with GIST_ORDER_PROC to try to define an order for data fetched in memory
> to sort in order to speed up the subsequent index building process. And I
> implemented pre-sorting methods in z-order pattern and Hilbert order
> pattern, Alos tested and compared pre-sorting methods on various data.
>
>
>
> *Links*
>
> ** *Code Base: https://github.com/postgis/postgis/pull/619
>
> * Wiki:
> https://trac.osgeo.org/postgis/wiki/ImplementSortingMethodsBeforeGistIndexBuilding
>
> * Test Results and Performance Comparison:
> https://docs.google.com/document/d/1_mY_F2hPDk3vmXH5PPp2z9BuQWt-ZMORk6KxtdVQ3HY/edit?usp=sharing
>
>
>
> *Conclusion*
>
> ** *The Morton/Hilbert hash function for pre-sorting can accelerate the
> process of index building process and reduce the time-consuming to
> one-third to one-fifth of the original
>
> * The query operation for different data in the same area demonstrates
> more stable performance
>
> * The query performance of pre-sorting is about 30% slower than the
> original
>
>
>
> *Future Work*
>
> ** *Try to figure out the reason for the slowness of the pre-sorting index
>
> * Implement a fast Morton/Hilbert hash function for n-dimension geometry
> objects
>
>
>
> Participating in the GSoC is a great experience for me. And thank you very
> much for my mentors, GSoCstaff and the community.
>
>
>
> Best regards,
>
> Han
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20210823/654b18af/attachment-0001.html>


More information about the postgis-devel mailing list