Random_Samples_Overlapping_Patterns

Version 1.0

Westerholt, Rene, 2016, "Random_Samples_Overlapping_Patterns", https://doi.org/10.7910/DVN/GNU8F1, Harvard Dataverse, V1


Description	These datasets simulate the overlap of two point patterns across a range of geographic scale configurations. The attribute values attached to the points reproduce spatial heterogeneity, i.e., the two patterns differ in their means but share a constant variance. The overall aim is to mimic the overlap of different phenomena in social media data, especially Twitter and similar feeds to which users contribute fully autonomously. (2016-03-18)
Subject	Earth and Environmental Sciences; Computer and Information Science; Social Sciences
Notes	These data simulate different configurations of overlapping patterns. The geographic scales of the patterns are varied, and the possible scale differences are combined with each other. This was done for 90 configurations, with random sampling (of geometry as well as attributes) repeated 100 times per configuration, giving 9,000 patterns overall.
The datasets in the uploaded ZIP file are organized into four folders: overlap_lws_non_rev, overlap_lws_rev, overlap_swl_non_rev and overlap_swl_rev. Here, swl stands for "small with large" and lws for "large with small." The reason for this naming scheme is that, in the lws case, 23.8% of the points of the larger-scale sub-pattern interact with at least one point of the smaller-scale sub-pattern; in the swl case the roles are reversed. When the scale difference becomes too large, the ratio of 23.8% can no longer be achieved (for purely geometric reasons), and the closest achievable fit was used instead. The suffix "non_rev" means that the attribute values were distributed so that they increase from the center towards the pattern boundary; "rev" denotes the reversed arrangement.
Each CSV file is named according to the scheme clust_rep_min_max. "rep" is the repetition (1 to 100), "min" is the minimum distance (m) at which any point within the larger-scale pattern interacts, and "max" is the corresponding maximum point spacing distance. Note that the scale of the small-scale pattern is fixed to [1,10]; only the scale of the large-scale pattern was varied.
Each file contains the following columns. The first column (no title) contains unique row IDs. The second column ("V1") contains the X coordinate. The third column ("V2") contains the Y coordinate. The fourth column ("V3") contains an integer indicating membership in one of the two sub-patterns: 0 means "small-scale," while any other number indicates the large-scale pattern. The fifth column ("vals") contains Gaussian random values, drawn from N(250,150) for the small-scale pattern and from N(750,150) for the large-scale pattern. The sixth and last column ("lag") contains the spatial lag of each observation, computed over neighborhoods defined by the larger-scale pattern (i.e., with a cut-off distance of "max"). The spatial weights were generated by an inverse distance weighting scheme.
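The file layout described in the notes can be read with a few lines of Python. The following is a minimal sketch under stated assumptions, not the author's code: the ".csv" extension, the numeric format of "min" and "max" in the file name, and the exact header layout are assumed, and the helper names parse_name, read_points and idw_lag are hypothetical. The lag function uses row-standardized weights w = 1/d for neighbors within the cut-off, which is only one possible reading of "inverse distance weighting" and may not reproduce the "lag" column exactly.

```python
import csv
import io
import math
import re

# Assumed file-name layout based on the "clust_rep_min_max" scheme;
# the optional ".csv" suffix and the number format are assumptions.
NAME_RE = re.compile(r"^clust_(\d+)_(\d+(?:\.\d+)?)_(\d+(?:\.\d+)?)(?:\.csv)?$")


def parse_name(filename):
    """Split a file name into repetition, minimum and maximum distance."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return {
        "rep": int(match.group(1)),
        "min": float(match.group(2)),
        "max": float(match.group(3)),
    }


def read_points(handle):
    """Read one pattern; the untitled first column is stored as 'id'."""
    reader = csv.reader(handle)
    header = next(reader)
    header[0] = header[0] or "id"
    points = []
    for row in reader:
        rec = dict(zip(header, row))
        points.append({
            "id": rec["id"],
            "x": float(rec["V1"]),
            "y": float(rec["V2"]),
            # V3 == 0 marks the small-scale sub-pattern, anything else the large-scale one.
            "small_scale": int(float(rec["V3"])) == 0,
            "vals": float(rec["vals"]),
            "lag": float(rec["lag"]),
        })
    return points


def idw_lag(points, cutoff):
    """Row-standardized inverse-distance lag (an assumed weighting, see lead-in)."""
    lags = []
    for i, p in enumerate(points):
        num = den = 0.0
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.hypot(p["x"] - q["x"], p["y"] - q["y"])
            if 0.0 < d <= cutoff:  # coincident points are skipped to avoid 1/0
                num += q["vals"] / d
                den += 1.0 / d
        lags.append(num / den if den > 0 else float("nan"))
    return lags


# Tiny in-memory example with made-up values (not taken from the dataset).
sample = io.StringIO(
    '"","V1","V2","V3","vals","lag"\n'
    '"1",0,0,0,200,0\n'
    '"2",3,4,1,800,0\n'
    '"3",100,100,1,700,0\n'
)
points = read_points(sample)
lags = idw_lag(points, cutoff=10.0)
```

For a real file, the cut-off would be taken from the file name, for example idw_lag(points, parse_name(name)["max"]). The double loop is quadratic in the number of points, so a spatial index would be preferable for large patterns.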
License/Data Use Agreement	CC0 1.0

	1 File
	overlapping_patterns.zip	ZIP Archive, 1.2 GB, published Mar 18, 2016, 12 downloads, MD5: d98083bd7f5b918166fe0e4f8c87cf55
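The MD5 value in the file listing can be used to check a downloaded copy of the archive. The following is a small sketch using Python's standard hashlib module; the helper names file_md5 and matches_listing are hypothetical, and the path passed in is whatever location the archive was saved to.

```python
import hashlib

# Checksum as given in the file listing for overlapping_patterns.zip.
EXPECTED_MD5 = "d98083bd7f5b918166fe0e4f8c87cf55"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """True if the file at 'path' has the checksum shown in the listing."""
    return file_md5(path) == EXPECTED_MD5
```

Reading in chunks matters here because the archive is about 1.2 GB; loading it into memory in one call would be wasteful.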

Citation Metadata

Persistent Identifier	doi:10.7910/DVN/GNU8F1
Publication Date	2016-03-18
Title	Random_Samples_Overlapping_Patterns
Author	Westerholt, Rene (Heidelberg University)
Point of Contact	Westerholt, Rene (Heidelberg University)
Topic Classification	Twitter; Spatial Analysis; Social Media; Spatial Statistics; Spatial Heterogeneity
Producer	Westerholt, Rene (GIScience Research Group, Heidelberg University) http://www.geog.uni-heidelberg.de/personen/gis_westerholt_en.html
Production Date	2016-03-17
Production Location	Heidelberg
Distributor	Westerholt, Rene (GIScience Research Group, Heidelberg University) http://www.geog.uni-heidelberg.de/personen/gis_westerholt_en.html
Distribution Date	2016-03-18
Depositor	Westerholt, Rene
Deposit Date	2016-03-18
Data Type	Point Pattern

Dataset Terms

License/Data Use Agreement

Our Community Norms as well as good scientific practices expect that proper credit is given via citation. Please use the data citation shown on the dataset page.

Creative Commons CC0 1.0 Universal Public Domain Dedication (CC0 1.0).
