
The Role of
Biodiversity Informatics



In the wake of
increased threats from deforestation, land-use change, species invasion,
soil degradation, pollution, and climate change, the global community felt an
urgent need to address biodiversity as an important dimension of our lives.
Global concern for the conservation of biodiversity led to the Convention on
Biological Diversity; at present, 188 countries are parties to the convention.

The Convention on
Biological Diversity has three objectives:

Conservation of biological diversity.

Sustainable use of its components.

Fair and equitable sharing of the benefits arising from the genetic
and other resources of biological diversity.

Biodiversity information is critical to a wide range of scientific, educational, and
governmental uses, and is essential to decision-making in many realms.
Initiatives to integrate data into viable resources for innovation in science,
technology, and decision-making are being developed at local, regional, and
global levels. A formidable challenge lying ahead is the integration of these initiatives
into an organized, well-resourced, global approach to building and managing
biodiversity information resources through collaborative effort. The existence
of biodiversity data resources from different fields of knowledge, available to
all interested parties, and the strong demand to integrate, synthesize, and visualize
this information for different purposes and by different end users, is leading
to the development of a new field of research that can be termed biodiversity
informatics. This emerging field represents the conjunction of efficient use
and management of biodiversity information with new tools for its analysis and
understanding. The field has great potential in diverse realms, with
applications such as prediction of the distributions of known and unknown
species. This potential nonetheless remains largely unexplored, as the field is
only now becoming a vibrant area of inquiry and study. Since biodiversity
conservation is a multidisciplinary science, it draws on the principles
of many other disciplines, such as ecology, taxonomy, systematics, biogeography,
geoinformatics, molecular biology, population genetics, philosophy, anthropology,
sociology, information technology, and economics. For
conservation biologists and biodiversity experts, the challenge is to preserve
the evolutionary potential and ecological viability of a vast array of
biodiversity, and to preserve the complex nature, dynamics, and interrelationships
of natural systems.

Roots of Biodiversity Informatics:

Australia has been a
leading country in biodiversity informatics. Since the mid-1970s, Australian
herbaria have been digitizing their data cooperatively. The Environmental
Resources Information Network (ERIN) was established in 1989 to provide
geographically related environmental information for planning and decision-making.
Also in 1989, HISPID, Herbarium Information Standards and Protocols for
Interchange of Data, a standard format for interchange of electronic herbarium
specimen information, developed by a committee of representatives from all
Australian herbaria, was first published. ERIN’s experience set an example for
several other initiatives, such as Mexico’s Comisión Nacional para el
Conocimiento y Uso de la Biodiversidad (CONABIO), Costa Rica’s Instituto
Nacional de Biodiversidad (INBio), and Brazil’s Base de Dados Tropical (BDT).
INBio and CONABIO both became fully engaged in biodiversity informatics after
the exchange of experience with ERIN experts in 1993. In the early 1990s,
researchers from diverse fields of expertise held meetings, and began the Biodiversity
Information Network – Agenda 21 (BIN21) initiative. This group established what
was called a Special Interest Network with a Virtual Library. BIN21 was set up
as an informal, collaborative, distributed network consisting of a series of
participating “nodes” aiming at complementing existing or planned actions.
BIN21 was actively involved in discussions of the Clearing-House Mechanism (CHM)
of the CBD, and produced a document proposing the structure being used today,
composed of focal points and thematic networks. At the time, owing to
technological limitations, what was
envisaged was creation of directories of people, institutions, and data
sources. In 1998, a research project was launched at the University of Kansas
Natural History Museum and Biodiversity Research Center: the Species Analyst (TSA).
TSA’s main objective was to develop standards and software tools for access to
world natural history collection and observation databases. This project was
one of the first networks to draw on distributed data sources from biological
collections worldwide, setting an example for other initiatives, which were
attracted by examples based on an associated modeling tool, the Genetic
Algorithm for Rule-set Prediction (GARP), originally developed by David
Stockwell (Stockwell and Noble, 1991).

 Global and Regional Efforts:

Several global and
regional efforts are aiming at organizing data stakeholders and making data available
for conservation and sustainable development research. The Global Biodiversity
Information Facility (GBIF) had its genesis in a recommendation from a working
group of the Megascience Forum of the Organisation for Economic Co-operation and
Development (OECD), although GBIF currently has no direct ties to the OECD. GBIF
was founded in March 2001 and participation is open to any interested country,
economy, or recognized international organization that agrees to make
scientific biodiversity information available. Although GBIF intends eventually
to incorporate all levels of biodiversity data (molecules to ecosystems), its
initial phase is focusing on species- and specimen-level information. Its work
program emphasizes four priority areas: (1) digitization of natural history
collection data, (2) data access and database interoperability, (3) an electronic
catalogue of names of known organisms, and (4) outreach and capacity building.
The GBIF data portal, which connects participant nodes and other data
providers, came on-line in February 2004, and by October 2004 was serving over
40 million records. Working with its partners, GBIF is promoting development
and adoption of standards and protocols for documenting and exchanging biodiversity
data. GBIF has partnered with the Catalogue of Life initiative to speed up
development of a global authority file for the approximately 1.75 million named
species on Earth, including synonyms and vernacular names. In addition, GBIF is
tackling several sociological and policy issues.
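The core pattern behind such a data portal, fanning one query out to independent provider nodes and merging the answers into a single result set, can be sketched as follows; the field and function names here are illustrative, not GBIF's actual protocol:

```python
# Illustrative sketch of a federated occurrence-record search.
# Record fields and function names are hypothetical examples.

def node_query(records, scientific_name):
    """One provider node: filter its locally held records by species name."""
    return [r for r in records if r["scientificName"] == scientific_name]

def portal_search(nodes, scientific_name):
    """The portal fans the query out to every participant node
    and merges the answers into one result set."""
    results = []
    for node in nodes:
        results.extend(node_query(node, scientific_name))
    return results

# Two hypothetical provider nodes serving their own records:
herbarium = [{"scientificName": "Puma concolor", "country": "BR"}]
museum = [{"scientificName": "Puma concolor", "country": "MX"},
          {"scientificName": "Felis catus", "country": "MX"}]

hits = portal_search([herbarium, museum], "Puma concolor")
```

The data stay with their providers; only the query and its answers travel, which is what makes shared standards and protocols for documenting and exchanging records so central to the approach.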

The Biodiversity Information Infrastructure:

Most biodiversity informatics
initiatives are focusing on species and specimen data as the first necessary
information component of a comprehensive global data network on biodiversity. Abiotic
environmental data and ecological data are increasingly being used in
biodiversity informatics for modeling distribution patterns of species and
populations, and therefore also need to be addressed.

Biodiversity Data:

Specimen collections
are the primary research archives documenting biological diversity on Earth. The
2.5-3.0 billion specimens available in biological collections worldwide
document identities, habitats, histories, and spatial distributions of the
roughly 1.75 million described species of life. The specimen vouchers and
associated information provide a fundamental resource for biological
systematics. Return on investments made during 250 years of global biological
inventories can be realized dramatically through digitization and integration
of information about species and specimens. Less than 10% of worldwide specimens
are available in the electronic domain.



Access to consistent,
scientifically credible taxonomic information is essential to many activities,
including natural resource and waste management for sustainable use,
environmental monitoring, regulation, and biotechnology development. Storage
and retrieval of biological data requires high-quality, well-documented, and
continuously updated sources of taxonomic information. A basic concept in
databasing is the use of controlled vocabularies to assure that one can find,
relate, and retrieve information that refers to a particular class of objects.
This seemingly simple concept, of course, may become a nightmare when
databasing specimen data. As such, prompt access to digital authoritative data
on taxonomic and vernacular names becomes critical. For broader applications,
information systems integrating disparate biodiversity databases worldwide need
a core dictionary of names for mediating queries across datasets—the species
name may often be the only field in common between databases. The niche for
nomenclatural data and taxonomic authority files has brought numerous taxonomic
databases onto the Internet, including local fauna and flora checklists, and
“taxon-by-taxon” databases such as the Integrated Taxonomic Information System (ITIS).
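How such a core dictionary of names mediates queries across disparate datasets can be sketched as follows; the taxon names and helper functions are hypothetical illustrations, not a real taxonomic service:

```python
# Minimal sketch of name-mediated joins across two databases whose only
# common field is the species name. All names below are hypothetical.

# Authority file: maps any known name (accepted or synonym) to an accepted name.
AUTHORITY = {
    "Felis concolor": "Puma concolor",   # synonym -> accepted name
    "Puma concolor": "Puma concolor",    # accepted name maps to itself
}

def resolve(name):
    """Return the accepted name for a taxon, or None if unknown."""
    return AUTHORITY.get(name)

def join_on_species(db_a, db_b):
    """Join two record lists that share only a species-name field,
    after normalizing names through the authority file."""
    index = {}
    for rec in db_a:
        accepted = resolve(rec["species"])
        if accepted:
            index.setdefault(accepted, []).append(rec)
    joined = []
    for rec in db_b:
        for match in index.get(resolve(rec["species"]), []):
            joined.append((match, rec))
    return joined
```

Without the authority file, a specimen databased under the synonym would simply never match an observation databased under the accepted name.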

Environmental Data:

Efforts to improve
understanding of environmental patterns, their variability, their changes over
time, and their implications for human welfare and decision-making, depend critically
on the quality, accessibility, and usability of diverse environmental and
related social science data. A key scientific and technical challenge,
therefore, is improving access to existing and emerging sources of
environmental, biological, and socioeconomic data, and improving the integration
of these data in support of disciplinary and interdisciplinary research efforts
and applications, and related policy-making initiatives. Terrestrial environmental
data fall into three basic categories: terrain, climate, and substrate, and all
are fundamental in ecological niche modeling. Terrain refers to surface
morphology and includes parameters such as elevation, slope, and aspect. Climate
data summarize patterns and variation in atmospheric characteristics, and
substrate data include soils, lithology, surface geology, hydrology, and
landforms. Climate change projections constitute another key suite of
environmental datasets. For many regions, they are only available at global
scales (Intergovernmental Panel on Climate Change, IPCC), and hence may not
have the precision required for modeling at local scales. Integrating regional
climate models (RCMs) may represent a promising frontier to be explored. Demand
for environmental data by the biological community involved with ecological niche
modeling is quite recent. It is important to enable open access to data by
sharing data formats and by developing open-source tools for data conversion,
visualization, and analysis. Inherent challenges lie in managing data resources
effectively for optimal access and use, and in developing rational rules and
structures for that process. It is also important to address technical aspects
concerning the interoperability of environmental data across software and
hardware systems and, in the case of niche modeling, to develop tools for
automated dataset conversion.
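As a small illustration of how terrain, climate, and substrate layers feed niche modeling, co-registered rasters are commonly flattened into one environment vector per grid cell before any algorithm sees them; the grids below are invented toy values, assuming numpy:

```python
import numpy as np

# Toy 2x3 grids standing in for co-registered terrain, climate, and
# substrate layers over the same extent (values are invented).
elevation     = np.array([[100, 200, 300], [150, 250, 350]])
temperature   = np.array([[22,  20,  18],  [21,  19,  17]])
precipitation = np.array([[900, 800, 700], [850, 750, 650]])

def stack_layers(layers):
    """Flatten co-registered raster layers into an (n_cells, n_layers)
    matrix: one environmental vector per grid cell, ready for modeling."""
    return np.stack([np.asarray(layer).ravel() for layer in layers], axis=1)

env = stack_layers([elevation, temperature, precipitation])
# env[i] is the (elevation, temperature, precipitation) vector for cell i
```

This step is exactly why interoperability matters: each layer must share the same extent, resolution, and coordinate system before cells can be aligned.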


Use of biodiversity
data in biogeographic studies has imposed an extra focus on issues of data quality.
When digitizing specimen data, data quality considerations include errors in
taxonomic identification, geocoding, and in the transcription process itself.
Errors are common and are to be expected, but cannot be ignored. Good understanding
of errors and error propagation can lead to active quality control and
management improvement. The heterogeneous origin of the distributed biodiversity
databases makes quality control even more important. Procedures must be used to
detect and flag errors or potential errors. This problem becomes even more
crucial when one considers all digitization processes being carried out by most
museums and herbaria in the world, with legacy data covering the past 100-300
years. Historical data cannot be replaced by new surveys even if necessary funds
were available, owing to loss of biodiversity and habitat changes and the
unique nature of each organism that constitutes a specimen. Geocoding historical
data can also be very complex, and is an additional potential source of errors. Emerging
web-based tools for validating georeferences, taxonomic identifications, and collection
dates (or at least flagging records with high probabilities of error) are
leading to development of complex automated data-validation tools. The need for
tools capable of detecting geographic or ecological outliers, incorrectly georeferenced
localities, and misidentified specimens, such that doubtful records are flagged
for later checking by examination of specimens, is great. New tools will
provide users with summary files that can easily be linked with master
collections databases to update database records. The speciesLink and ORNIS
projects are developing a number of data-cleaning tools that are being tested and
evaluated by scientific collections.
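A crude example of the kind of automated check such tools build on, flagging coordinates that lie far from the bulk of a species' records, might look like the sketch below; real data-cleaning tools use far richer tests than this simple z-score rule:

```python
import numpy as np

def flag_outliers(points, threshold=3.0):
    """Flag occurrence records whose coordinates lie more than `threshold`
    standard deviations from the mean in either dimension. This is a crude
    geographic-outlier check for illustration only: flagged records are
    candidates for expert review, not automatic rejection."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    std = pts.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero on degenerate columns
    z = np.abs((pts - mean) / std)
    return (z > threshold).any(axis=1)
```

The output is a boolean mask that could be joined back to a master collections database, mirroring the flag-for-later-checking workflow described above.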


In general, existing
biodiversity data do not provide sufficient coverage for direct, detailed environmental
decisions. Modeling, or some other inferential step, is thus needed for identifying and
filling data gaps, planning future research, assessing conservation priorities,
and providing information for environmental decisions. Modeling ecological niches
(ENM) for prediction of the geographic distributions of species is a growing field in
large-scale ecology and biodiversity informatics. Many modeling tools and
techniques can be used for ENM, such as BIOCLIM (Nix, 1986), generalized linear
models (GLM; Austin et al., 1994), generalized additive models (GAM; Yee and Mitchell,
1991), regression and classification tree analyses (CART; Breiman et al., 1984),
genetic algorithms (Stockwell and Peters, 1999), and artificial neural networks
(ANN; Olden and Jackson, 2002; Pearson et al., 2002), among others. Which method
one uses may depend on the number of points available, type of environmental
variables, availability of absence data, purpose to which the model is going to
be put, and personal preferences and experience.
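The simplest of these methods, BIOCLIM, fits a rectilinear envelope around the environmental conditions observed at the occurrence points and predicts presence wherever every variable falls inside it. A minimal sketch of that idea (not Nix's original implementation) follows:

```python
import numpy as np

def bioclim_fit(occurrence_env, lower=5, upper=95):
    """Fit a rectilinear envelope: the per-variable percentile range of
    environmental values observed at the occurrence points."""
    env = np.asarray(occurrence_env, dtype=float)
    return (np.percentile(env, lower, axis=0),
            np.percentile(env, upper, axis=0))

def bioclim_predict(envelope, env_values):
    """Predict presence (True) wherever every environmental variable
    falls inside the fitted envelope."""
    low, high = envelope
    env = np.asarray(env_values, dtype=float)
    return ((env >= low) & (env <= high)).all(axis=1)

# Invented (temperature, precipitation) values at six occurrence points:
occ = [[10, 100], [12, 110], [14, 120], [16, 130], [18, 140], [20, 150]]
envelope = bioclim_fit(occ)
```

Using percentile limits rather than the raw minima and maxima trims extreme records, a simple guard against exactly the kind of georeferencing and identification errors discussed earlier.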

Comparisons between
techniques do exist (Thuiller, 2003; Manel et al., 1999b), but are hard to accomplish.
Apart from explicit differences among algorithms, each algorithm is usually
implemented by a different tool which has its own restrictions regarding data
input and output. When performing comparisons, one should preferably ensure
that each algorithm runs under the same conditions and using the same input.
However, recent efforts are developing generic frameworks to support the development
and testing of modeling algorithms. BIOMOD (Thuiller, 2003) and openModeller
have followed this approach. BIOMOD provides four techniques to predict spatial
distributions (GLM, GAM, CART, and ANN), and includes accuracy testing for each
result. openModeller is an open-source library being developed as part of the
speciesLink project. It is entirely based on open-source software to accomplish tasks
like reading different map file formats, converting between coordinate systems,
and performing calculations. The current package includes several ENM algorithms
(BIOCLIM, Climate Space Model, GARP, and Euclidean distance techniques), and includes
SOAP and command-line interfaces; a desktop interface is available as a
plugin for the QuantumGIS project. These initiatives are providing researchers
with the required tools to compare modeling methodologies easily, and to spend
more time analyzing and interpreting results. Algorithm developers can
concentrate on algorithm logic when using frameworks that take care of handling
input data and making projections. Moreover, in the near future, generic
libraries like openModeller will be able to perform tasks in a distributed
fashion, including running analyses separately in remote cluster processors via
web services or Grid paradigms.
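The design idea behind such frameworks, a single shared contract so that every algorithm receives identical input and returns comparable output, can be sketched as below; the class and method names are illustrative, not openModeller's or BIOMOD's actual API:

```python
# Illustrative sketch of a generic niche-modeling framework interface.
# All class and method names are hypothetical.

class NicheModel:
    """Common contract: every algorithm implements fit() and score()."""
    def fit(self, occurrence_env):
        raise NotImplementedError
    def score(self, env_values):
        raise NotImplementedError

class MeanDistanceModel(NicheModel):
    """Toy algorithm: suitability falls with Euclidean distance
    from the centroid of the occurrence conditions."""
    def fit(self, occurrence_env):
        n = len(occurrence_env)
        dims = len(occurrence_env[0])
        self.centroid = [sum(row[d] for row in occurrence_env) / n
                         for d in range(dims)]
        return self

    def score(self, env_values):
        return [-sum((v - c) ** 2
                     for v, c in zip(row, self.centroid)) ** 0.5
                for row in env_values]

def compare(models, occurrence_env, test_env):
    """Run every algorithm on identical input, so results are comparable."""
    return {type(m).__name__: m.fit(occurrence_env).score(test_env)
            for m in models}
```

Because the framework owns data handling, swapping in a new algorithm means writing only the two methods, which is precisely how such libraries let developers concentrate on algorithm logic.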


Innovation and
infrastructure developments will greatly reduce long-term data capture costs in
the broader biodiversity community. Modular, configurable, open-source Web
services will provide interoperability and scalability in distributed environments.
By wrapping image processing, image-to-text conversion, and data markup capabilities
into distributed, interoperable web services, greater efficiency, portability,
and scalability will be achieved. It is expected that before the end of this
decade, worldwide natural history collections will be contributing hundreds of
millions of specimen records to Internet-accessible data servers. Good scientific
information is fundamental for sound environmental decision making, and design
of mechanisms to link scientific research to the decision-making process is no
easy matter (Reid, 2004). Biodiversity informatics will directly benefit
environmental education programs, resource management, conservation, and
biomedical and agricultural research.

Development of
interfaces with global environmental initiatives will be fundamental to promote
coordination and avoid duplication of efforts. In July 2003, the Earth
Observation Summit was held in Washington, D.C. (USA), with the goal of
promoting development of a comprehensive, coordinated, and sustained Earth
observation system among governments and the international community to
understand and address global environmental and economic challenges. As an immediate
result, an ad hoc Group on Earth Observations (GEO) was established to prepare
a 10-year implementation plan for building such a system.



Chavan VS, Ingwersen P. BMC
Bioinformatics. 2009. Currently, primary scientific data, especially that dealing
with biodiversity, is neither easily discoverable nor accessible. Amongst
several impediments, one is a lack of professional recognition of scientific
data publishing efforts. A possible solution is establishment of a ‘Data
Publishing Framework’ which would encourage and recognise investments and
efforts by institutions and individuals towards management, and publishing of
primary scientific data potentially on a par with recognitions received for
scholarly publications.

Increasing the profile of crop
diversity in Uzbekistan. 24
Jul 2017. The research showed that farmers are the key suppliers of selected
target crops originating in central Asia. However, they do not always have
access to high-quality or improved seed varieties. The connections with
research organizations and seed quality control agencies through which farmers
obtained information and materials were also studied and shown to be rather
weak. Thus, an important aim was to strengthen these connections and introduce
farmers to a wider range of food-crop varieties to raise the profile of fruit and
vegetable species. The outcome was an average increase of 30% in the number of
propagated varieties for apple and apricot, and of up to 45%, depending on
project site, in the quantity of high-quality saplings of target fruit tree species.

Promoting agricultural
biodiversity in the mountains of Nepal. 14
Jul 2017. A valuable resource aimed at researchers, development professionals,
planners, field staff as well as farmers, variety catalogues are very useful in
agriculture research and development, and help empower communities to restore
and sustainably manage their own landscapes to meet their needs. They provide
easy access for farmers, seed producers, suppliers and extension officials to
pertinent information, increasing demand for and encouraging the growing of
different species and varieties, promoting a healthy and productive food
system. Incorporating agricultural biodiversity into farms will thus provide
nutritious diets and generate income, creating benefits for the farmers, consumers
and the environment, while safeguarding this precious, and threatened, resource
for future generations.

Historic seed samples to monitor
genetic variation. 29
Jun 2017. All this valuable information was made available on the Bioversity
International website in 2014 in the Collecting Missions Database and is now
set to be accessed by a much higher number of people as the data is now
published on the Global Biodiversity Information Facility (GBIF), the biggest
biodiversity database on the Internet. GBIF is an open data infrastructure
helping institutions to publish their data according to common standards and
provides a single point of access to hundreds of millions of records.

Integrated Landscape
Initiatives in South and Southeast Asia. 16 Jun 2017. The survey showed that ILIs
in South and Southeast Asia, just like those in Latin America and Africa, tend
to be most commonly motivated by a need for conservation and the sustainable
use of natural resources, followed by agricultural development. Bioversity
International’s Camilla Zanzanaini, co-author of the publication, points to an
underlying rationale behind such results: “Conservation groups tend to push
for ‘integrated’ landscape initiatives in order to convince more communities
and stakeholders who are usually not interested in conservation alone (but who
clearly affect the resources within the landscape), to commit to creating more
sustainable landscapes.”

Bisby FA et al. (2000). Informatics is
the study of the processing, management, and retrieval of information.
Bioinformatics is a highly interdisciplinary field of science which allows
interpretation of biological data in a meaningful context. This domain of
informatics will be useful for professionals, especially those working in
research libraries of biological sciences. The key components of how to
use informatics techniques to manage biodiversity knowledge
are discussed with a prototype of the Global Biodiversity Information Facility
(GBIF). Conservation of biodiversity needs an integrated compilation of local
and global biodiversity data, which can be achieved through biodiversity informatics.

Peterson AT & Vieglais D (2001). Information on geographic distributions in the form of
primary point occurrence data (Peterson et al. n.d.) is harvested from new
biodiversity information sources, niches of species are modeled in ecological
space, and niches are projected onto potentially invaded landscapes. The
advantage of this modeling procedure is that the possibility of an invasion can
be assessed before the actual introduction of the species, as is illustrated
herein by means of four case studies. Given that introductions and the negative
effects of a particular invasion are difficult to predict, we outline a way to
build biota-wide sets of projections to examine risks of species invasions for
all species from a particular region. Thus the reactive nature of current
solutions is replaced with a proactive, predictive approach.















Austin, M.P., A.O. Nicholls, M.D. Doherty and J.A. Meyers. 1994. Determining
species response functions to an environmental gradient by means of a beta
function. J. Veg. Sci.

Beard, C.B., G. Pye,
F.J. Steurer, Y. Salinas, R. Campman, A.T. Peterson, J.M. Ramsey, R.A. Wirtz and
L.E. Robinson. 2002. Chagas disease in a domestic transmission cycle in
southern Texas, USA. Emerg. Infect. Dis. 9:103-105.

Berendsohn, W.,
A. Güntsch and D. Röpert. 2003. Survey of existing publicly distributed
collection management and data capture software solutions used by the world’s
natural history collections. Global Biodiversity Information Facility,

Bisby, F.A. 2000. The
quiet revolution: Biodiversity informatics and the Internet. Science.

Bisby, F.A., R. Froese,
M.A. Ruggiero and K.L. Wilson. 2004. Species2000 & ITIS Catalogue of Life: Indexing
the world’s known species (CD-ROM). Species2000. Los Baños, Philippines.

Breiman, L., J.H. Friedman, R.A. Olshen and
C.J. Stone. 1984. Classification and regression trees. Chapman and Hall, New York.

Canhos, D.A.L., A.D.
Chapman and V.P. Canhos. 2004. Study on data-sharing with countries of origin. Global
Biodiversity Information Facility, Copenhagen.

Canhos, D.A.L., P. Uhlir
and J.M. Esanu (editors). 2004. Access to Environmental Data: Summary of an
Inter-American Workshop. Committee on Data for Science and Technology, Paris.

Canhos, V.P., D.A.L.
Canhos, S. Souza, M.F. Siqueira, M. Muñoz, R. Giovanni, A. Marino, I. Koch,
R.L. Fonseca, C.Y. Umino, B. Cruz and A.P.S Albano. 2004. Sistema de informação
distribuído para coleções biológicas: A integração do Species Analyst e
SinBiota. Relatório Técnico Anual. FAPESP, São Paulo, Brazil.