PERPUSTAKAAN BIG

Classification of geological borehole descriptions using a domain adapted large language model

Hossein Ghorbanfekr; Pieter Jan Kerstens; Katrijn Dirix

Geological borehole descriptions contain detailed textual information about the composition of the subsurface. However, their unstructured format presents significant challenges for extracting relevant features into a structured format. This paper introduces GEOBERTje: a domain-adapted large language model trained on geological borehole descriptions from Flanders (Belgium) in the Dutch language. This model effectively extracts relevant information from the borehole descriptions and represents it in a numeric vector space. Showcasing just one potential application of GEOBERTje, we fine-tune a classifier model on a limited number of manually labeled observations. This classifier categorizes borehole descriptions into a main, second and third lithology class. We show that our classifier outperforms a rule-based approach (by 30% on average), non-contextual Word2Vec embeddings combined with a random forest classifier (by 38% on average), and a prompt engineering method with large language models, namely GPT-4 (by 11% on average) and Gemma 2 (by 28% on average). This study exemplifies how domain-adapted large language models enhance the efficiency and accuracy of extracting information from complex, unstructured geological descriptions. This offers new opportunities for geological analysis and modeling using vast amounts of data.


Availability
229 · 551.136 · Perpustakaan BIG (external hard disk) · Available
Detailed Information
Series Title
Applied Computing and Geosciences - Open Access
Call Number
551.136
Publisher
Amsterdam: Elsevier, 2025
Physical Description
15 pp., PDF, 2,295 KB
Language
English
ISBN/ISSN
2590-1974
Classification
551.136
Content Type
text
Media Type
-
Carrier Type
-
Edition
Vol. 25, February 2025
Subject
Classification
Natural language processing
Borehole description
Large language model
Specific Detail Info
-
Statement of Responsibility
-
Other/Related Versions

No other versions available

File Attachment
  • Classification of geological borehole descriptions using a domain adapted large language model