Big data
Part of the Data management and Parallel computing themes from CEGIS
Big data typically refers to data sets that are too large to be dealt with by traditional data-related software. It can be difficult to move around, to maintain, to search, and to use effectively.
Improved scientific instruments, the increasing volume and lifecycle of sensors, and the passage of time all make the conveyance, cataloging, and analysis of large data sets more complex.
CEGIS research works to find efficient ways to access and process large and diverse datasets. We seek to optimize computation time and related storage requirements.
Publications
Below is a sampling of publications. More are available, and new ones are published throughout the year.
Check back often or view our custom search for more!
At what scales does a river meander? Scale-specific sinuosity (S3) metric for quantifying stream meander size distribution
Stream bend geometry is linked to terrain features, hydrologic and ecologic conditions, and anthropogenic forces. Knowledge of the distributions of geometric properties of streams advances understanding of changing landscape conditions and associated processes that operate over a range of spatial scales. Statistical decomposition of sinuosity in natural linear features has proven a...
Authors: Larry Stanislawski, Barry J. Kronenfeld, Barbara P. Buttenfield, Ethan J. Shavers

A guide to creating an effective big data management framework
Many agencies and organizations, such as the U.S. Geological Survey, handle massive geospatial datasets and their auxiliary data and are thus faced with challenges in storing data and ingesting it, transferring it between internal programs, and egressing it to external entities. As a result, these agencies and organizations may inadvertently devote unnecessary time and money to convey...
Authors: Samantha Arundel, Kevin G McKeehan, Bryan B Campbell, Andrew N. Bulen, Philip T. Thiem

Generalization quality metrics to support multiscale mapping: Hausdorff and average distance between polylines
Large geospatial datasets must often be generalized for analysis and display at reduced scales. Automated methods including artificial intelligence and deep learning are being applied to this problem, but the results are often analyzed on the basis of limited and subjective measures. To better support automation, a project is underway to develop a robust Python toolkit for computing...
Authors: Barry J. Kronenfeld, Larry Stanislawski, Barbara P. Buttenfield, Ethan J. Shavers

Transferring deep learning models for hydrographic feature extraction from IfSAR data in Alaska
The National Hydrography Dataset (NHD) managed by the U.S. Geological Survey (USGS) is being updated with higher-quality feature representations through efforts that derive hydrography from 3DEP HR elevation datasets. Deriving hydrography from elevation through traditional flow routing and interactive methods is a complex, time-consuming process that must be tailored for different...
Authors: Larry V. Stanislawski, Nattapon Jaroenchai, Shaowen Wang, Ethan J. Shavers, Alexander Duffy, Philip T. Thiem, Zhe Jiang, Adam Camerer

Historical maps inform landform cognition in machine learning
No abstract available.
Authors: Samantha Arundel, Sinha Gaurav, Wenwen Li, David P. Martin, Kevin G McKeehan, Philip T. Thiem

GeoImageNet: A multi-source natural feature benchmark dataset for GeoAI and supervised machine learning
The field of GeoAI or Geospatial Artificial Intelligence has undergone rapid development since 2017. It has been widely applied to address environmental and social science problems, from understanding climate change to tracking the spread of infectious disease. A foundational task in advancing GeoAI research is the creation of open, benchmark datasets to train and evaluate the...
Authors: Wenwen Li, Sizhe Wang, Samantha Arundel, Chia-Yu Hsu

GeoAI and the future of spatial analytics
This chapter discusses the challenges of traditional spatial analytical methods in their limited capacity to handle big and messy data, as well as mining unknown or latent patterns. It then introduces a new form of spatial analytics—geospatial artificial intelligence (GeoAI)—and describes the advantages of this new strategy in big data analytics and data-driven discovery. Finally, a...
Authors: Wenwen Li, Samantha Arundel

Deep learning detection and recognition of spot elevations on historic topographic maps
Some information contained in historical topographic maps has yet to be captured digitally, which limits the ability to automatically query such data. For example, U.S. Geological Survey’s historical topographic map collection (HTMC) displays millions of spot elevations at locations that were carefully chosen to best represent the terrain at the time. Although research has attempted to...
Authors: Samantha Arundel, Trenton P. Morgan, Philip T. Thiem
CEGIS - Denver, Colorado

CEGIS - Rolla, Missouri

Samantha T Arundel, PhD
Research Director
Senior Science Advisor
Ethan Shavers, PhD
CEGIS Section Chief/ Supervisory Geographer
Jung kuan (Ernie) Liu
Physical Research Scientist