Sayadasite: Organizing Data

Data hierarchy refers to the systematic organization of data, often in a hierarchical form. Data organization involves characters, fields, records, files and so on. This concept is a starting point when trying to see what makes up data and whether data has a structure. For example, how does a person make sense of data such as 'employee', 'name', 'department', 'Marcy Smith', 'Sales Department' and so on, assuming that they are all related? One way to understand them is to see these terms as smaller or larger components in a hierarchy. One might say that Marcy Smith is one of the employees in the Sales Department, or an example of an employee in that Department. The data we want to capture about all our employees, and not just Marcy, is the name, ID number, address etc.
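The hierarchy described above can be sketched in code. The following is a minimal Python illustration, not a prescribed design: the `Employee` class, the ID values and the second employee are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """One record: a group of related fields describing a single employee."""
    name: str        # field (itself a sequence of characters)
    id_number: str   # field
    department: str  # field


# A "file" in the data-hierarchy sense: a collection of records of one type.
# The ID numbers and the second employee are invented for illustration.
employee_file = [
    Employee(name="Marcy Smith", id_number="E001", department="Sales Department"),
    Employee(name="Alex Jones", id_number="E002", department="Accounts Department"),
]

# Moving down the hierarchy: file -> record -> field -> character.
first_record = employee_file[0]
first_field = first_record.name
first_character = first_field[0]

# Because every record has the same fields, related records are easy to select.
sales_staff = [e.name for e in employee_file if e.department == "Sales Department"]
print(sales_staff)
```

Here 'Marcy Smith' and 'Sales Department' are field values, 'name' and 'department' are field names, and 'employee' is the record type.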

Define organizing data

Data organization is the practice of categorizing and classifying data to make it more usable. Just as you keep important documents in a file folder, you need to arrange your data in a logical and orderly fashion so that you, and anyone else who accesses it, can easily find what you are looking for.

Why is data organization important?

Good data organization strategies are important because your data contains the keys to managing your company’s most valuable assets. Getting insights out of this data could help you obtain better business intelligence and play a major role in your company’s success.

Which is a method of organizing the data?

The methods used to organize data are classification, tabulation, graphical presentation and diagrammatic presentation.

What can you organize?

Your data is probably stored in one of a few common structures. Tabular data are flat, rectangular files, such as data held in a spreadsheet. Most research data is stored in this structure.

Hierarchical files are typically XML files, which can save data and metadata in the same file. This structure is used to avoid redundancy. Relational databases organize data in multiple tables, which can hold great quantities of data and handle complex queries.
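To compare the three structures, here is a small Python sketch that stores the same two employees as tabular text, as an XML hierarchy and in a relational database. It uses only the standard library; the table names, the `created` attribute and the employee 'Sam Lee' are assumptions made up for the example.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Tabular: one flat, rectangular table. The department repeats on every row.
tabular_text = "name,department\nMarcy Smith,Sales\nSam Lee,Sales\n"
rows = list(csv.DictReader(io.StringIO(tabular_text)))

# Hierarchical: data and metadata (the 'created' attribute) share one XML file,
# and the department is stored once, as the parent of its employees.
xml_text = """
<dataset created="2024-01-01">
  <department name="Sales">
    <employee>Marcy Smith</employee>
    <employee>Sam Lee</employee>
  </department>
</dataset>
""".strip()
root = ET.fromstring(xml_text)
xml_names = [element.text for element in root.iter("employee")]

# Relational: several tables linked by keys, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE employee (name TEXT, department_id INTEGER)")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Marcy Smith", 1), ("Sam Lee", 1)],
)
query = """
    SELECT employee.name
    FROM employee
    JOIN department ON employee.department_id = department.id
    WHERE department.name = 'Sales'
    ORDER BY employee.name
"""
sql_names = [row[0] for row in conn.execute(query)]
conn.close()

print(rows, xml_names, sql_names)
```

All three hold the same information; they differ in how much is repeated and in how the data can be queried.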

In any good data organization strategy, understanding your data’s structure is key to unlocking its value. Data can be stored in two ways: structured or unstructured. An estimated 80 to 90 percent of the world’s data is unstructured, and unstructured data is growing many times faster than structured data.

Data that is formatted, tagged, and organized in databases is referred to as structured. It can be easily accessed, processed, and analyzed.

What is Organization of Data? Mention various Methods for Organizing Data.

Classification of data refers to the categorization of data. It includes summarizing the frequency of individual scores, or ranges of scores, for a variable. Data is grouped on the basis of its similarities.

The objectives of classification are to present data in a condensed form and to bring out its affinities and diversities. Data may be classified on the basis of qualitative or quantitative aspects.
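As an illustration of classification, the Python sketch below groups a list of scores into ranges (a quantitative classification) and into pass/fail categories (a qualitative one). The scores, the range width of 20 and the pass mark of 60 are all assumptions made up for the example.

```python
from collections import Counter

# Made-up exam scores for illustration.
scores = [35, 42, 58, 61, 61, 67, 73, 78, 84, 91]


def score_range(score, width=20):
    """Return the label of the range a score falls into, e.g. '60-79'."""
    lower = (score // width) * width
    return f"{lower}-{lower + width - 1}"


# Quantitative classification: frequency of scores in each range.
range_frequencies = Counter(score_range(s) for s in scores)

# Qualitative classification: frequency of each category.
grade_frequencies = Counter("pass" if s >= 60 else "fail" for s in scores)

for label in sorted(range_frequencies):
    print(label, range_frequencies[label])
print(dict(grade_frequencies))
```

Ten individual scores are condensed into four ranges or two categories, which is the "condensed form" that classification aims for.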

Another method is tabulation of data. It is a way to arrange data systematically in rows and columns. The aim is to simplify the presentation and to make comparisons easier, keeping in view the objectives of the study.
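A small example may help. The Python sketch below tabulates made-up survey records into rows (departments) and columns (responses); the records and the column widths are assumptions chosen for illustration.

```python
from collections import Counter

# Made-up survey records: (department, response).
records = [
    ("Sales", "yes"), ("Sales", "no"), ("Sales", "yes"),
    ("Accounts", "no"), ("Accounts", "yes"),
]

counts = Counter(records)
row_labels = sorted({department for department, _ in records})
col_labels = sorted({response for _, response in records})

# One header row, then one row per department with a count in each column.
lines = [f"{'department':<12}" + "".join(f"{c:>5}" for c in col_labels)]
for department in row_labels:
    cells = "".join(f"{counts[(department, c)]:>5}" for c in col_labels)
    lines.append(f"{department:<12}{cells}")

table = "\n".join(lines)
print(table)
```

Laid out this way, the responses of the two departments can be compared at a glance.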

The other technique is graphical presentation. Data is plotted on a pictorial platform formed of horizontal and vertical lines. The purpose is to provide a systematic way of “looking at” and understanding of the data.

A graph can take the form of a polygon, a chart or a diagram. It is drawn on two mutually perpendicular lines called the X and Y axes.

A diagram is also used to present statistical data in a simple, readily comprehensible form. Diagrammatic presentation differs from graphical presentation: a diagram is used only to present data visually, whereas a graph can also be used for further analysis.

There are different forms of diagram, e.g. the bar diagram, sub-divided bar diagram, multiple bar diagram, pie diagram and pictogram.

After data is collected, classified and organized, it is not always possible to mention every piece of data in a report.

Instead, the researcher summarizes the data by describing the whole data set using just a few numbers. Summarizing data also makes it easier to analyze later.
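As a sketch of what "a few numbers" might look like, the following Python example summarizes a made-up list of scores with the standard `statistics` module. Which measures to report is a choice for the researcher; the ones here are only common examples.

```python
import statistics

# Made-up exam scores for illustration.
scores = [35, 42, 58, 61, 61, 67, 73, 78, 84, 91]

summary = {
    "count": len(scores),
    "minimum": min(scores),
    "maximum": max(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "standard deviation": round(statistics.stdev(scores), 2),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```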

Organising your data

Once you create, gather, or start manipulating data and files, they can quickly become disorganised. To save time and prevent errors later on, you and your colleagues should decide how you will name and structure files and folders. Including documentation (or 'metadata') will allow you to add context to your data so that you and others can understand it in the short, medium, and long term.

Below you can find some guidance on:

Ø Naming and Organising Files

Ø Documentation and Metadata

Ø Managing References

Ø Organising E-mail

Naming and Organising files

Choosing a logical and consistent way to name and organise your files allows you and others to easily locate and use them. Ideally, the best time to think how to name and structure the documents and directories you create is at the start of a project.

Agreeing on a naming convention will help to provide consistency, which will make it easier to find and correctly identify your files and will prevent version control problems when working on files collaboratively. Organising your files carefully will save you time and frustration by helping you and your colleagues find what you need when you need it.

How should I organise my files?

Whether you are working on a standalone computer or on a networked drive, it takes a little planning to establish a system that lets you access your files, avoid duplication, and ensure that your data can be backed up. A good place to start is to develop a logical folder structure. The following tips should help you develop such a system:

Use folders - group files within folders so information on a particular topic is located in one place

Adhere to existing procedures - check for established approaches in your team or department which you can adopt

Name folders appropriately - name folders after the areas of work to which they relate and not after individual researchers or students. This avoids confusion in shared workspaces if a member of staff leaves, and makes the file system easier to navigate for new people joining the workspace

Be consistent – when developing a naming scheme for your folders it is important that once you have decided on a method, you stick to it. If you can, try to agree on a naming scheme from the outset of your research project

Structure folders hierarchically - start with a limited number of folders for the broader topics, and then create more specific folders within these

Separate ongoing and completed work - as you start to create lots of folders and files, it is a good idea to separate your older documents from those you are currently working on. Try to keep your ‘My Documents’ folder for files you are actively working on and, every month or so, move the files you are no longer working on to a different folder or location, such as a folder on your desktop, a special archive folder or an external hard drive

Back up – ensure that your files, whether they are on your local drive or on a network drive, are backed up

Review records - assess materials regularly or at the end of a project to ensure files are not kept needlessly. Put a reminder in your calendar so you do not forget!
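Several of the tips above can be partly automated. The Python sketch below creates a hierarchical folder structure and moves files that have not been modified for a while into an archive folder. The folder names, the 30-day threshold and the example file names are assumptions for illustration; the demo runs in a temporary directory so it touches no real files.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path

# Hypothetical layout: broad topics at the top, more specific folders inside.
FOLDERS = ["data/raw", "data/processed", "documentation", "outputs/figures", "archive"]

SECONDS_PER_DAY = 24 * 60 * 60


def create_structure(root: Path) -> None:
    """Create the folder hierarchy under root (safe to run more than once)."""
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)


def archive_old_files(root: Path, source: str, max_age_days: int) -> list:
    """Move files in `source` not modified for `max_age_days` days to archive/."""
    cutoff = time.time() - max_age_days * SECONDS_PER_DAY
    moved = []
    for path in sorted((root / source).iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(root / "archive" / path.name))
            moved.append(path.name)
    return moved


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    create_structure(root)

    old_file = root / "data/processed/2023-01-10_results.csv"
    new_file = root / "data/processed/2024-06-01_results.csv"
    old_file.write_text("x\n")
    new_file.write_text("x\n")

    # Pretend the first file was last modified 60 days ago.
    sixty_days_ago = time.time() - 60 * SECONDS_PER_DAY
    os.utime(old_file, (sixty_days_ago, sixty_days_ago))

    moved = archive_old_files(root, "data/processed", max_age_days=30)
    archived = sorted(p.name for p in (root / "archive").iterdir())
    print(moved, archived)
```

A script like this does not replace a backup; it only keeps the working folders limited to current files.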

What do I need to consider when creating a file name?

Decide on a file naming convention at the start of your project.

Useful file names are:

Ø Consistent

Ø Meaningful to you and your colleagues

Ø Easy to find.

It is useful if your department/project agrees on the following elements of a file name:

Vocabulary – choose a standard vocabulary for file names, so that everyone uses a common language

Punctuation – decide on conventions for if and when to use punctuation symbols, capitals, hyphens and spaces

Dates – agree on a logical use of dates so that they display chronologically i.e. YYYY-MM-DD

Order - confirm which element should go first, so that files on the same theme are listed together and can therefore be found easily

Numbers – specify the number of digits that will be used in numbering so that files are listed numerically, e.g. 01, 002, etc.
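One way to hold a team to such an agreement is to build and check file names in code. The Python sketch below assumes a hypothetical convention of `<project>_<description>_<YYYY-MM-DD>_<three-digit number>.<extension>`; the convention itself, the function names and the example values are illustrative only.

```python
import re
from datetime import date

# Hypothetical convention: project_description_YYYY-MM-DD_NNN.ext
NAME_PATTERN = re.compile(
    r"^[a-z0-9]+_[a-z0-9-]+_\d{4}-\d{2}-\d{2}_\d{3}\.[a-z0-9]+$"
)


def build_name(project: str, description: str, day: date, number: int, ext: str) -> str:
    """Build a file name that follows the convention."""
    # Lower case, with hyphens instead of spaces, keeps punctuation uniform.
    description = description.strip().lower().replace(" ", "-")
    # isoformat() gives YYYY-MM-DD; :03d pads the number to three digits.
    return f"{project.lower()}_{description}_{day.isoformat()}_{number:03d}.{ext}"


def follows_convention(name: str) -> bool:
    """Check whether an existing file name matches the convention."""
    return NAME_PATTERN.match(name) is not None


names = [
    build_name("survey", "Interview Notes", date(2013, 8, 16), 2, "txt"),
    build_name("survey", "Interview Notes", date(2013, 8, 2), 10, "txt"),
    build_name("survey", "Interview Notes", date(2012, 12, 1), 1, "txt"),
]

# With YYYY-MM-DD dates, a plain alphabetical sort is also chronological.
print(sorted(names))
print(follows_convention("Interview notes 16-8-13.txt"))
```

Putting the project and description before the date keeps files on the same theme together, and the fixed-width date and number keep them in order within that theme.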

How should I name my files, so that I know which document is the most recent version?

Very few documents are drafted by one person in one sitting. More often there will be several people involved in the process and it will occur over an extended period of time. Without proper controls this can quickly lead to confusion as to which version is the most recent. Here is a suggestion of one way to avoid this:

Use a 'revision' numbering system. Any major changes to a file can be indicated by whole numbers, for example, v01 would be the first version, v02 the second version. Minor changes can be indicated by increasing the decimal figure for example, v01_01 indicates a minor change has been made to the first version, and v03_01 a minor change has been made to the third version.

When draft documents are sent out for amendments, upon return they should carry additional information to identify the individual who has made the amendments. Example: a file with the name datav01_20130816_SJ indicates that a colleague (SJ) has made amendments to the first version on the 16th August 2013. The lead author would then add those amendments to version v01 and rename the file following the revision numbering system.

Include a 'version control table' for each important document, noting changes and their dates alongside the appropriate version number of the document. If helpful, you can include the file names themselves along with (or instead of) the version number.

Agree who will finalise documents and mark them as 'final'.
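The revision numbering scheme above lends itself to a simple check for the most recent version. The Python sketch below parses names such as datav01, datav01_01 and datav01_20130816_SJ; the regular expression, the function names and the list of files are assumptions for illustration, and file extensions are left out to match the example in the text.

```python
import re

VERSION_PATTERN = re.compile(
    r"^(?P<stem>.+?)v(?P<major>\d{2})"              # e.g. datav01
    r"(?:_(?P<minor>\d{2}))?"                       # optional minor change, e.g. _01
    r"(?:_(?P<date>\d{8})_(?P<initials>[A-Z]+))?$"  # optional reviewer copy, e.g. _20130816_SJ
)


def parse_version(name):
    """Return the version details encoded in a file name, or None if it has none."""
    match = VERSION_PATTERN.match(name)
    if match is None:
        return None
    return {
        "major": int(match["major"]),
        "minor": int(match["minor"] or 0),
        "reviewer": match["initials"],  # None unless this is a reviewer's copy
        "date": match["date"],
    }


def latest_version(names):
    """Return the highest-numbered file, skipping reviewers' copies."""
    candidates = []
    for name in names:
        info = parse_version(name)
        if info is not None and info["reviewer"] is None:
            candidates.append((info["major"], info["minor"], name))
    return max(candidates)[2] if candidates else None


files = ["datav01", "datav01_01", "datav02", "datav01_20130816_SJ", "datav03_01", "datav03"]
latest = latest_version(files)
print(latest)
print(parse_version("datav01_20130816_SJ"))
```

Reviewers' copies are skipped because, under the scheme above, the lead author merges their amendments into the next numbered version.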

Numerous external resources also offer guidance on file naming conventions.

Documentation and Metadata

To ensure that you understand your own data and that others may find, use and properly cite your data, it helps to add documentation and metadata (data about data) to the documents and datasets you create.

What are 'documentation' and 'metadata'?

The term 'documentation' encompasses all the information necessary to interpret, understand and use a given dataset or set of documents. On this website, we use 'documentation' and 'metadata' (data about data - usually embedded in the data files/documents themselves) interchangeably.

When and how do I include documentation/metadata?

It is a good practice to begin to document your data at the very beginning of your research project and continue to add information as the project progresses. Include procedures for documentation in your data planning.

There are a number of ways you can add documentation to your data:

Embedded documentation

Information about a file or dataset can be included within the data or document itself. For digital datasets, this means that the documentation can sit in separate files (for example text files) or be integrated into the data file(s), as a header or at specified locations in the file.

Examples of embedded documentation include:

Code, field and label descriptions

Descriptive headers or summaries

Recording information in the Document Properties function of a file (Microsoft)
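As an illustration of a descriptive header integrated into a data file, the Python sketch below reads a small made-up CSV file whose first lines document the file and its fields. The '# key: value' header style is an assumption for this example, not a standard, and all values are invented.

```python
import csv
import io

# A made-up data file that begins with an embedded descriptive header.
data_file = """\
# title: Plant height measurements
# creator: A. Researcher
# height_cm: plant height in centimetres
# site: field site code (N = north plot, S = south plot)
site,height_cm
N,12.5
S,14.0
"""


def split_header(text):
    """Separate '# key: value' documentation lines from the data rows."""
    documentation, data_lines = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            key, _, value = line.lstrip("# ").partition(":")
            documentation[key.strip()] = value.strip()
        else:
            data_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))
    return documentation, rows


documentation, rows = split_header(data_file)
print(documentation["height_cm"])
print(rows)
```

Because the field descriptions travel with the data, anyone who opens the file can see what 'height_cm' and 'site' mean without a separate document.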

Supporting documentation

This is information in separate files that accompanies data in order to provide context, explanation, or instructions on confidentiality and data use or reuse.

Examples of supporting documentation include:

Working papers or laboratory books

Questionnaires or interview guides

Final project reports and publications

Catalogue metadata

Supporting documentation should be structured, so that it can be used to identify and locate the data via a web browser or web based catalogue. Catalogue metadata is usually structured according to an international standard and associated with the data by repositories or data centres when materials are deposited.

Examples of catalogue data are:

Ø Title

Ø Description

Ø Creator

Ø Funder

Ø Keywords

Ø Affiliation

The Digital Curation Centre provides examples of discipline-specific metadata.
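A catalogue record with the fields listed above might be prepared as follows. This Python sketch is only illustrative: it does not follow any particular metadata standard, repositories define their own required fields and formats, and every value shown is a placeholder.

```python
import json

# The catalogue fields listed above; a real repository sets its own requirements.
REQUIRED_FIELDS = ["title", "description", "creator", "funder", "keywords", "affiliation"]

# All values are placeholders made up for illustration.
record = {
    "title": "Plant height measurements",
    "description": "Heights of plants measured at two field sites.",
    "creator": "A. Researcher",
    "funder": "Example Funding Body",
    "keywords": ["plants", "height", "field survey"],
    "affiliation": "Example University",
}


def missing_fields(metadata):
    """List the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


catalogue_entry = json.dumps(record, indent=2)
print(missing_fields(record))
print(catalogue_entry)
```

Checking a record for missing fields before deposit avoids having to reconstruct the information later.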

Tools for metadata tracking and data standards

ISA Tools - metadata tracking tools for life sciences

The open source ISA metadata tracking tools help to manage an increasingly diverse set of life science, environmental and biomedical experiments that employ one or a combination of technologies.

Built around the ‘Investigation’ (the project context), ‘Study’ (a unit of research) and ‘Assay’ (analytical measurement) general-purpose tabular format, the ISA tools help you to provide a rich description of the experimental metadata (i.e. sample characteristics, technology and measurement types, sample-to-data relationships) so that the resulting data and discoveries are reproducible and reusable.

FAIRsharing - searchable portal of inter-related data standards, databases, and policies for life sciences

FAIRsharing is a curated, searchable portal of inter-related data standards, databases, and policies in the life, environmental, and biomedical sciences.

Managing References

Projects can last for months or years, and it is easy to lose track of which piece of information came from which source. It can be a challenge to have to reconstruct half of your citations in the scramble at the end of the project! Your future self may not remember everything that seems obvious in the present, so it is important to take clear notes about your sources.

What is 'reference management software'?

Reference management software helps you keep track of your citations as you work, and partially automates the process of constructing bibliographies when it is time to publish. The University of Cambridge also offers support and training on several referencing systems.

Who can help me with reference conventions and formats for my academic discipline or particular project?

Your departmental librarian will be able to help you pick the right format for references and will probably know about some useful search and management tools that you have not used before. Feel free to ask them for advice.
Your college librarian is also a very good resource and is there to help.
Find your departmental and college librarian on the University's Libraries Directory.

Organising e-mail

Most people now routinely send and receive lots of messages every day and, as a result, their inboxes can very quickly become overloaded with hundreds of personal and work-related emails. Setting aside some time to organise your emails will ensure information can be found quickly and easily, and is stored securely.

Why should I organise my email?

Apart from the obvious frustration and time wasted looking for that email you remember sending to someone last month, email is increasingly used to store important documents and data, often with information related to the attachments within the email itself. Without proper controls in place, these emails can easily be deleted by mistake. It is also important to remember that your work email comes under The Data Protection Act 1998 and the Freedom of Information Act 2000, so your emails are potentially open to scrutiny.

What are the first steps to organising my email?

If your emails have got out of control there are a number of immediate steps you can take to control the problem:

Archive your old emails. If you have hundreds of emails hanging around from more than a month ago, move them into a new folder called something like "Archive". You can always come back to these at a later date.

Now go through your remaining inbox email by email. If an email is useless, delete it. If not, ask yourself: is it "active" - is there a specific action you, or someone else, need to take, or do you just vaguely think it is worth keeping? If the latter, move it to the archive.
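The two steps above amount to a small decision procedure, sketched here in Python. It works on plain in-memory records rather than a real mailbox; the `Message` class, its flags, the 30-day threshold and the example messages are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Message:
    subject: str
    received: date
    useless: bool = False       # nothing in it is worth keeping
    needs_action: bool = False  # you or someone else still has to act on it


def triage(inbox, today, archive_after_days=30):
    """Sort messages into those to keep in the inbox, archive, or delete."""
    cutoff = today - timedelta(days=archive_after_days)
    kept, archived, deleted = [], [], []
    for message in inbox:
        if message.received < cutoff:
            archived.append(message)  # step 1: old emails go to the archive
        elif message.useless:
            deleted.append(message)   # step 2: useless emails are deleted
        elif message.needs_action:
            kept.append(message)      # "active" emails stay in the inbox
        else:
            archived.append(message)  # vaguely worth keeping: archive
    return kept, archived, deleted


inbox = [
    Message("Old thread", date(2024, 3, 1)),
    Message("Newsletter", date(2024, 6, 20), useless=True),
    Message("Review draft by Friday", date(2024, 6, 25), needs_action=True),
    Message("FYI: meeting notes", date(2024, 6, 22)),
]
kept, archived, deleted = triage(inbox, today=date(2024, 6, 30))
print([m.subject for m in kept])
```

After triage, the inbox holds only the messages that still need someone to act.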

How can I ensure my emails remain organised?

Here are some general tips to ensure your email remains organised in the long term:

Delete emails you do not need. Remove any trivial or old messages from your inbox and sent items on a regular (ideally daily) basis.

Use folders to store messages. Establish a structured file directory by subject, activity or project.

Separate personal emails. Set up a separate folder for these. Ideally, you should not receive any personal emails to your work email account.

Limit the use of attachments. Use alternative and more secure methods to exchange data where possible (see ‘data sharing’ for options). If attachments are used, exercise version control and save important attachments to other places, such as a network drive.
