Files, Database, eCommerce

Session 10
Master's Program in Electrical Engineering (Magister Teknik Elektro)
Universitas Udayana
Managing Files
• Managing Files: Basic Concepts.
– Data is organized in a data storage hierarchy of
increasingly complex levels: bits, bytes (characters),
fields, records, files, and databases.
• A character is a letter, number, or special character.
• A field consists of one or more characters (bytes).
• A record is a collection of related fields.
• A file is a collection of related records. A database is, as
mentioned, an organized collection of integrated files.
Important to data organization is the key field, a field used
to uniquely identify a record so that it can be easily
retrieved and processed.
• Data storage hierarchy
– The ranked levels of data stored in a
computer: bits, bytes (characters), fields,
records, files, and databases.
– Why it’s important: Understanding the data
storage hierarchy is necessary to understand
how to use a database.
• Character
– A single letter, number, or special character.
Why it’s important: Characters—such as A, B,
C, 1, 2, 3, #, $, %—are part of the data
storage hierarchy.
– Use an ASCII chart from the Web such as
http://wls.wwco.com/ref/ascii.html to spell out
your name in binary. Don’t forget the space
between your first and middle name nor the
one between your middle and last name. How
many bits were required?
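The exercise above can also be checked with a few lines of code. This is a minimal sketch, assuming 8 bits per character (7-bit ASCII padded to one byte); the helper `to_binary` and the name "Ada Mae Lee" are made up for illustration.

```python
def to_binary(text: str) -> list[str]:
    """Return one 8-bit binary string per character of an ASCII text."""
    # encode("ascii") raises UnicodeEncodeError for non-ASCII characters.
    return [format(byte, "08b") for byte in text.encode("ascii")]


name = "Ada Mae Lee"  # hypothetical name; the two spaces count as characters
bits = to_binary(name)

for char, pattern in zip(name, bits):
    print(repr(char), pattern)

# Every character, including each space, occupies one byte (8 bits).
total_bits = 8 * len(bits)
print("Total bits:", total_bits)
```

Under these assumptions an 11-character name, spaces included, needs 88 bits.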
• Field
– Unit of data consisting of one or more characters
(bytes). An example of a field is your name, your
address, or your Social Security number.
– Why it’s important: A collection of fields makes up a
record. Also see key field.
• Record
– Definition: Collection of related fields. An example of
a record would be your name and address and Social
Security number.
– Why it’s important: Related records make up a file.
• File
– Collection of related records. An example of a
file is data collected on everyone employed in
the same department of a company, including
all names, addresses, and Social Security
numbers.
– Why it’s important: A file is the collection of
data or information that is treated as a unit by
the computer; a collection of related files
makes up a database.
– Often an unknown file’s extension will be the
only way of finding out what application
created it.
• Files are given names—filenames
– Filenames also have extension names, short additions
(traditionally up to three letters) such as .doc and .txt.
Among the types of files are the following.
• (1) Program files are files containing software instructions.
The two most important are source program files, which
contain instructions in the form written by the programmer,
and executable files, which contain instructions that tell a
computer how to perform a particular task.
• (2) Data files are files that contain data.
• (3) Other common files are ASCII files, which are text only;
image files for digitized graphics; audio files, which contain
digitized sound; animation/video files, used for conveying
moving images; and Web files, which are files carried over
the World Wide Web.
• Filename
– The name given to a file
– Why it’s important: Files are given names so
that they can be differentiated. Filenames also
have extension names. These extensions of
up to three letters are added after a period
following the filename—for example, the .doc
in Psychreport.doc is recognized by Microsoft
Word as the extension for "document."
Extensions are usually inserted automatically
by the application software.
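Splitting a filename into its name and extension can be sketched as follows. The helper `extension_of` is hypothetical, and `notes.txt` is a made-up filename; `Psychreport.doc` and `bmi` are the names used on these slides.

```python
from pathlib import Path


def extension_of(filename: str) -> str:
    """Return the extension, including the period, or '' if there is none."""
    return Path(filename).suffix


for name in ["Psychreport.doc", "notes.txt", "bmi"]:
    ext = extension_of(name)
    print(name, "->", ext if ext else "(no extension)")
```

A file such as `bmi` has no extension at all, so the extension is a hint about the file type rather than a guarantee.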
• Program files
– Definition: Files containing software instructions.
Why it’s important: Program files hold instructions rather
than data. Contrast data files.
– For More Info: Below are the contents of an actual
program file named bmi. This program is written in the
Perl programming language, which is very popular
with system administrators, programmers, and Web
developers. The program asks the user for his or her
height and weight, and then calculates the person’s
BMI (Body Mass Index) based on those two inputs.
Finally, the program outputs the user’s BMI, along
with a statement indicating whether the user is normal
weight, overweight, or obese. (You can find many
pages relating to BMI on the WWW, some of which
have built-in BMI calculators that do the equivalent of
this Perl program.)
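#!/usr/bin/perl
use strict;
use warnings;

# Read height and weight, dropping the trailing newline from each input.
print "Enter your height in inches: ";
chomp(my $height = <STDIN>);
print "Enter your weight in pounds: ";
chomp(my $weight = <STDIN>);

# BMI in U.S. units: weight (lb) divided by height (in) squared, times 703.
my $bmi = ($weight / $height / $height) * 703;
print "Your BMI is $bmi\n";

# Classify the result using the program's three ranges.
if ($bmi < 25) {
    print "Normal weight: < 25 BMI\n";
}
elsif ($bmi < 30) {
    print "Overweight: 25 to 29.9 BMI\n";
}
else {
    print "Obese: 30 and above BMI\n";
}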
• Two main ways in which a storage device
accesses stored data are sequential
access and direct access.
– Sequential storage means that data is stored
and retrieved in sequence, as is the case with
magnetic-tape storage.
– Direct access storage means that a
computer can go directly to the information
you want, as in a CD player; hard disks and
other types of disks are of this nature.
• Sequential storage
– Definition: Storage system whereby data is stored
and retrieved in sequence, such as alphabetically.
– Why it’s important: An inexpensive form of storage,
sequential storage is the only type of storage
provided by tape, which is used mostly for archiving
and backup. The disadvantage of sequential file
organization is that searching for data is slow.
Compare direct access storage.
Magnetic tape sequential storage system.
• Direct access storage
– Storage system that allows the computer to go
directly to the desired information. The data is
retrieved (accessed) according to a unique data
identifier called a key field. On disks formatted with a FAT
file system, it also relies on a file allocation table (FAT), a
hidden on-disk table that records exactly where the parts of
a given file are stored.
– Why it’s important: This method of file organization,
used with hard disks and other types of disks, is ideal
for applications where there is no fixed pattern to the
requests for data—for example, in airline reservation
systems or computer-based directory-assistance
operations. Direct access storage is much faster than
sequential access storage.
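The difference between the two access methods can be illustrated in memory. This is only an analogy, not a model of tape or disk hardware; the records, the `id` key field, and both helper functions are assumptions made for the sketch.

```python
records = [
    {"id": 101, "name": "Ayu"},
    {"id": 205, "name": "Budi"},
    {"id": 309, "name": "Citra"},
]


def sequential_find(records, key):
    """Read records in stored order until the key matches; count the reads."""
    reads = 0
    for record in records:
        reads += 1
        if record["id"] == key:
            return record, reads
    return None, reads


# Direct access: an index maps each key field value straight to its record.
index = {record["id"]: record for record in records}


def direct_find(index, key):
    """Go straight to the record for this key without scanning the others."""
    return index.get(key)


found, reads = sequential_find(records, 309)
print(found["name"], "after", reads, "reads")
print(direct_find(index, 309)["name"], "via the index")
```

The sequential search has to pass over every earlier record, which is why it slows down as a file grows, while the keyed lookup does not depend on where the record sits.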
• Whether on magnetic tape or disk, data
may be stored offline or online.
– Offline storage means that data is not
directly accessible for processing until the
tape or disk has been loaded onto an input
device.
– Online storage means that stored data is
randomly (directly) accessible for processing.
• Offline storage
– System in which stored data is not directly accessible
for processing until the tape or disk it’s on has been
loaded onto an input device.
– Why it’s important: The storage medium and data are
not under the direct, immediate control of the central
processing unit.
– In addition to online storage and offline storage, more
recently the term "near line storage" has come into
existence. Like online storage, near line storage is
directly accessible by the CPU. But like offline
storage, users may have to wait awhile before their
request for data is fulfilled.
• A database management system (DBMS) consists of
programs that control the structure of a database and
access to the data. The benefits of databases are file
sharing, reduced data redundancy, improved data
integrity, and increased security.
• Databases can be classified into four types.
– (1) An individual database is a collection of integrated files
used by one person. It could be a personal information
manager, which helps people keep track of information they
use daily.
– (2) A shared database, or company database, is shared by
users in one organization in one location.
– (3) A distributed database is stored on different computers in
different locations connected by a client/server network.
– (4) A public databank is a compilation of data available to the
public; many such databanks are Web sites.
– The last three types should have a database
administrator to coordinate activities and needs.
• Database management system (DBMS)
– Also called a database manager; software that
controls the structure of a database and access to the
data. Allows users to manipulate more than one file at
a time.
– Why it’s important: This software enables sharing of
data (same information is available to different users);
economy of files (several departments can use one
file instead of each individually maintaining its own
files, thus reducing data redundancy, which in turn
reduces the expense of storage media and
hardware); data integrity (changes made in the files in
one department are automatically made in the files in
other departments); security (access to specific
information can be limited to selected users).
Examples of Database Management Systems
• Oracle database
• IBM DB2
• Adaptive Server Enterprise
• FileMaker
• Firebird
• Ingres
• Informix
• Microsoft Access
• Microsoft SQL Server
• Microsoft Visual FoxPro
• MySQL
• PostgreSQL
• Progress
• SQLite
• Teradata
• CSQL
• OpenLink Virtuoso
• Daffodil DB
• Individual database
– Collection of integrated files used by one person.
– Why it’s important: Microcomputer users can set up
their own individual databases using popular
database management software; the information is
stored on the hard drives of their personal computers.
Today the principal database programs are Microsoft
Access, Corel Paradox, and Lotus Approach. In
addition, types of individual databases known as
personal information managers (PIMs) can help users
keep track of and manage information used on a daily
basis, such as addresses, telephone numbers,
appointments, to-do lists, and miscellaneous notes.
Popular PIMs are Microsoft Outlook, Lotus Organizer,
and Act.
• Shared database
– Also called a company database; a database
shared by users in one company or
organization in one location. The organization
owns the database, which may be stored on a
server such as a mainframe. Users are linked
to the database via a local area or wide area
network; the users access the network
through terminals or microcomputers.
– Why it’s important: Shared databases, such
as those you find when surfing the Web, are
the foundation for a great deal of electronic
commerce, particularly B2B commerce.
• Public databank
– Compilation of data available to the public.
– Why it’s important: The public databank is one
of the basic types of database.
– Web Exercise: One public database available
on the Web is the Social Security Death
Index.
• Database administrator
– Person who coordinates all related activities and
needs for an organization’s database.
– Why it’s important: The DBA determines user access
privileges; sets standards, guidelines, and control
procedures; assists in establishing priorities for
requests; prioritizes conflicting user needs; and
develops user documentation and input procedures.
He or she is also concerned with security—setting up
and monitoring a system for preventing unauthorized
access and making sure that the system is regularly
backed up and that data can be recovered should a
failure or disaster occur.
• Database Models. Databases can be organized in
four ways.
– (1) In a hierarchical database, fields or records are arranged
in related groups resembling a family tree, with child
(lower-level) records subordinate to parent (higher-level) records.
– (2) A network database is similar to a hierarchical database
but each child record can have more than one parent record.
– (3) A relational database relates, or connects, data in
different files through the use of a key field. Structured query
language is an easy-to-use computer language for making
queries to a relational database and for retrieving selected
records. One feature of most query languages is query by
example (QBE), which allows users to ask for information in a
relational database by using a sample record to define the
qualifications they want for selected records.
– (4) An object-oriented database uses objects, software
written in small, reusable chunks, as elements within database
files. An object consists of data in any form and instructions on
the action to be taken on the data.
• Hierarchical database
– Database in which fields or records are
arranged in related groups resembling a
family tree, with child (lower-level) records
subordinate to parent (higher-level) records.
The parent record at the top of the database
is called the root record.
– Why it’s important: The hierarchical database
is one of the common database structures.
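The parent/child arrangement can be pictured as a tree in which every record is reached by one path down from the root. A minimal sketch, assuming an in-memory tree; the `Record` class, `path_to` helper, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A record with exactly one parent (implicit) and any number of children."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, child: "Record") -> "Record":
        self.children.append(child)
        return child


def path_to(root, target, trail=()):
    """Return the record names from the root down to target, or None."""
    trail = trail + (root.name,)
    if root.name == target:
        return list(trail)
    for child in root.children:
        found = path_to(child, target, trail)
        if found:
            return found
    return None


root = Record("Company")  # the root record
sales = root.add_child(Record("Sales"))
root.add_child(Record("Engineering"))
sales.add_child(Record("Employee: Dewi"))

print(path_to(root, "Employee: Dewi"))
```

Because each child hangs off a single parent, there is only one access path to any record.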
• Network database
– Database similar in structure to a hierarchical
database; however, each child record can
have more than one parent record. Thus, a
child record, which in network database
terminology is called a member, may be
reached through more than one parent, which
is called an owner.
– Why it’s important: The network database is
one of the common database structures.
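The owner/member relationship can be sketched with two lookup tables, one in each direction. This is an illustration under assumed names (`NetworkDB`, `link`, and the sample courses and students), not how any particular network DBMS stores its data.

```python
from collections import defaultdict


class NetworkDB:
    """Tracks owner-to-member links; a member may have several owners."""

    def __init__(self):
        self.members_of = defaultdict(set)  # owner -> its members
        self.owners_of = defaultdict(set)   # member -> its owners

    def link(self, owner, member):
        self.members_of[owner].add(member)
        self.owners_of[member].add(owner)


db = NetworkDB()
db.link("Course: Databases", "Student: Made")
db.link("Club: Robotics", "Student: Made")
db.link("Course: Databases", "Student: Putu")

# The same member record is reachable through two different owners.
print(sorted(db.owners_of["Student: Made"]))
```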
• Relational database
– Common database structure that relates, or connects,
data in different files through the use of a key field, or
common data element. In this arrangement there are
no access paths down through a hierarchy. Instead,
data elements are stored in different tables made up
of rows and columns. In database terminology, the
tables are called relations (files), the rows are called
tuples (records), and the columns are called attributes
(fields). Each table has a key field that uniquely identifies
each row, and related tables are connected by sharing that
key field.
– Why it’s important: The relational database is one of
the common database structures; it is more flexible
than hierarchical and network database models.
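Relating two tables through a shared key field can be sketched with Python's built-in `sqlite3` module. The table names, column names, and sample rows are assumptions made for the illustration; `emp_id` plays the role of the key field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,   -- key field: unique per row
        name   TEXT NOT NULL
    );
    CREATE TABLE phone (
        emp_id INTEGER REFERENCES employee(emp_id),  -- shared key field
        number TEXT NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ayu'), (2, 'Budi');
    INSERT INTO phone VALUES (1, '555-0100'), (2, '555-0101'), (2, '555-0102');
""")

# The join connects rows (tuples) from both tables (relations) via emp_id.
rows = conn.execute("""
    SELECT e.name, p.number
    FROM employee AS e
    JOIN phone AS p ON p.emp_id = e.emp_id
    ORDER BY e.emp_id, p.number
""").fetchall()

for name, number in rows:
    print(name, number)
```

No access path is fixed in advance: any two tables that share a key field can be joined at query time.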
• Structured Query Language (SQL)
– Standard language used to create, modify, maintain,
and query relational databases. Why it’s important:
SQL further simplifies database use.
– Historical Perspective: SQL is pronounced as
"sequel." How did this acronym get such an unlikely
pronunciation? The first structured query language
was developed by IBM in the 1970s; its product name
was "Sequel2."
• Query by example (QBE)
– Feature of query-language programs whereby the
user asks for information in a database by using a
sample record to define the qualifications he or she
wants for selected records.
– Why it’s important: QBE further simplifies database
use.
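One way to picture query by example is a helper that turns a partly filled sample record into an SQL query. This is a rough sketch under stated assumptions, not how any specific QBE product works; the `employee` table, the `query_by_example` helper, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ayu", "Sales", "Denpasar"),
        ("Budi", "Sales", "Ubud"),
        ("Citra", "IT", "Denpasar"),
    ],
)

ALLOWED_COLUMNS = {"name", "department", "city"}


def query_by_example(conn, sample):
    """Select the rows that match every filled-in field of the sample record."""
    filled = {col: val for col, val in sample.items() if val is not None}
    unknown = set(filled) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    sql = "SELECT name, department, city FROM employee"
    if filled:
        # Column names are checked above; values are passed as parameters.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filled)
    sql += " ORDER BY name"
    return [dict(row) for row in conn.execute(sql, list(filled.values()))]


# The blank name field means "any name"; the other fields must match.
sample = {"name": None, "department": "Sales", "city": "Denpasar"}
matches = query_by_example(conn, sample)
print(matches)
```

The user never writes the `WHERE` clause; the filled-in fields of the sample record define it.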
• Object-oriented database
– Database that uses "objects," software written in
small, reusable chunks, as elements within database
files.
– An object consists of (1) data in any form, including
graphics, audio, and video, and (2) instructions on the
action to be taken on the data.
– Why it’s important: A hierarchical or network database
might contain only numeric and text data. By contrast,
an object-oriented database might also contain
photographs, sound bites, and video clips. Moreover,
the object would store operations, called methods, the
programs that objects use to process themselves.
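The idea of an object that carries both its data and its methods can be sketched as a small class. The `PhotoObject` class, the `store` dictionary, and the four sample bytes are stand-ins invented for the illustration; a real object-oriented database would persist such objects rather than hold them in memory.

```python
from dataclasses import dataclass


@dataclass
class PhotoObject:
    """Bundles data (a title and raw image bytes) with methods that act on it."""
    title: str
    pixels: bytes  # stand-in for binary image data

    def size_in_bytes(self) -> int:
        return len(self.pixels)

    def describe(self) -> str:
        return f"{self.title} ({self.size_in_bytes()} bytes)"


# Hypothetical in-memory object store keyed by object ID.
store = {}
store["photo-1"] = PhotoObject("Campus gate", bytes([0, 255, 128, 64]))

print(store["photo-1"].describe())
```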
• Features of a Database Management System. A
database management system may have a number of
components.
– (1) A data dictionary is a procedures document or disk file
that stores the data definitions or a description of the structure
of data used in the database.
– (2) DBMS utilities are programs that allow you to maintain the
database by creating, editing, and deleting data, records, and
files.
– (3) A report generator is a program for producing an
onscreen or printed document from all or part of a database.
– (4) Different users are given different user access privileges,
as determined by the database administrator.
– (5) A DBMS should have system recovery features, so the
database administrator can recover the contents of the
database in the event of hardware or software failure. Four
approaches are: mirroring, with two copies of the database in
different locations; reprocessing, in which the processing can
be redone from a known past point; roll forward, a variant on
reprocessing; and rollback, which is used to undo unwanted
changes to the database.
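Of the four recovery approaches, rollback is the easiest to demonstrate in a few lines. A minimal sketch using Python's built-in `sqlite3` module; the `account` table, the owner name, and the rule that a balance may not go negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('Ayu', 100)")
conn.commit()  # the committed state is the known good point

try:
    # This update is an unwanted change: it would overdraw the account.
    conn.execute("UPDATE account SET balance = balance - 500 WHERE owner = 'Ayu'")
    (balance,) = conn.execute(
        "SELECT balance FROM account WHERE owner = 'Ayu'"
    ).fetchone()
    if balance < 0:
        raise ValueError("balance would go negative")
    conn.commit()
except ValueError:
    conn.rollback()  # undo everything since the last commit

(final_balance,) = conn.execute(
    "SELECT balance FROM account WHERE owner = 'Ayu'"
).fetchone()
print(final_balance)
```

After the rollback the table is back in its last committed state, as if the update had never run.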
• Data dictionary
– File that stores data definitions and descriptions of
database structure. It may also monitor new entries to
the database as well as user access to the database.
Why it’s important: The data dictionary monitors the
data being entered to make sure it conforms to the
rules defined during data definition. The data
dictionary may also help protect the security of the
database by indicating who has the right to gain
access to it.
• DBMS Utilities
– Programs that allow the maintenance of databases by
creating, editing, and deleting data, records, and files.
Why it’s important: DBMS utilities allow people to
establish what is acceptable input data, to monitor the
types of data being input, and to adjust display
screens for data input.
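The way a data dictionary checks new entries against the data definitions can be sketched as follows. The dictionary layout, the two field rules, and the `validate` helper are all hypothetical; real DBMS products store and enforce such definitions in their own formats.

```python
# Hypothetical data definitions: field name -> rules for acceptable values.
DATA_DICTIONARY = {
    "name": {"type": str, "max_length": 30},
    "age": {"type": int, "min": 0, "max": 130},
}


def validate(record):
    """Return a list of rule violations; an empty list means the record is OK."""
    errors = []
    for field_name, rule in DATA_DICTIONARY.items():
        if field_name not in record:
            errors.append(f"{field_name}: missing")
            continue
        value = record[field_name]
        if not isinstance(value, rule["type"]):
            errors.append(f"{field_name}: expected {rule['type'].__name__}")
            continue
        if "max_length" in rule and len(value) > rule["max_length"]:
            errors.append(f"{field_name}: longer than {rule['max_length']} characters")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{field_name}: below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{field_name}: above {rule['max']}")
    return errors


print(validate({"name": "Ayu", "age": 29}))
print(validate({"name": "Ayu", "age": -1}))
```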
• Report generator
– In a database management system, a program users
can employ to produce on-screen or printed-out
documents from all or part of a database.
– Why it’s important: Report generators allow users to
produce finished-looking reports without much fuss.
• Crystal Reports
(http://www.businessobjects.com/product/catalog/crystalreports/),
made by Crystal Decisions, is a very popular commercial
report generator.
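At its simplest, a report generator takes rows from a database and lays them out as a readable document. The sketch below is a toy plain-text version, unrelated to how Crystal Reports works; the `render_report` function and the sample rows are assumptions.

```python
def render_report(title, columns, rows):
    """Format rows as a fixed-width text table under a title line."""
    # Each column is as wide as its longest cell, header included.
    widths = [
        max([len(col)] + [len(str(row[i])) for row in rows])
        for i, col in enumerate(columns)
    ]

    def line(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "  ".join(padded).rstrip()

    lines = [title, line(columns), line("-" * width for width in widths)]
    lines.extend(line(row) for row in rows)
    return "\n".join(lines)


report = render_report(
    "Employees by department",
    ["Name", "Department"],
    [("Ayu", "Sales"), ("Budi", "IT")],
)
print(report)
```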
• Databases & the New Economy: E-Commerce, Data Mining, & B2B
Systems.
– Databases underpin the so-called New
Economy of computer, telecommunications,
and Internet companies in three ways: e-commerce,
data mining, and business-to-business (B2B) systems.
– E-commerce, or electronic commerce, is
the buying and selling of products and
services through computer networks; an
example is Amazon.com.
• E-commerce
– Electronic commerce; the buying and selling of products and
services through computer networks. Why it’s important: By
2003, total U.S. e-commerce sales to consumers are expected to
reach $108 billion, or 6% of consumer retail spending; online
shopping is growing even faster than the increase in computer
use, which has been fueled by the falling price of personal
computers.
• Data mining
– Computer-assisted process of sifting through and analyzing vast
amounts of data in order to extract meaning and discover new
knowledge.
– Why it’s important: The purpose of data mining is to describe past trends
and predict future trends. Thus, data-mining tools might sift
through a company’s immense collections of customer,
marketing, production, and financial data and identify what’s
worth noting and what’s not.
– KDNuggets (short for "Knowledge Discovery Nuggets") provides
a wealth of resources on data mining and Web mining: a free
e-newsletter; lists of publications; course descriptions; info on
companies and products; job listings; and much more.
• Data mining is the computer-assisted process of sifting
through and analyzing vast amounts of data in order to
extract meaning and discover new knowledge.
• Data mining begins with acquiring data and cleaning it of
errors to yield cleaned-up data and a version of it called
meta-data (which shows its origins and transformations),
which are then sent to a data warehouse, a special
database of cleaned-up data and meta-data.
• Three kinds of tools are used to perform data mining, or
finding and analyzing tasks: query and reporting tools,
multidimensional-analysis tools, and intelligent agents.
• Data mining is used in applications ranging from
marketing to health to science.
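The first stage described above, cleaning the data and recording meta-data before loading a data warehouse, can be sketched as follows. The raw rows, the `orders.csv` source name, the cleaning rules, and the dictionary standing in for the warehouse are all assumptions made for the illustration.

```python
from datetime import date

# Hypothetical raw input with typical problems.
raw_rows = [
    {"customer": " Ayu ", "amount": "120"},
    {"customer": "Budi", "amount": "n/a"},  # unparseable amount
    {"customer": "", "amount": "75"},       # missing customer
    {"customer": "Citra", "amount": "300"},
]


def clean(rows, source):
    """Return (cleaned rows, meta-data describing origin and transformations)."""
    cleaned, rejected = [], 0
    for row in rows:
        customer = row["customer"].strip()
        try:
            amount = int(row["amount"])
        except ValueError:
            rejected += 1
            continue
        if not customer:
            rejected += 1
            continue
        cleaned.append({"customer": customer, "amount": amount})
    metadata = {
        "source": source,
        "loaded_on": date.today().isoformat(),
        "rows_in": len(rows),
        "rows_rejected": rejected,
        "transformations": ["strip whitespace", "parse amount as integer"],
    }
    return cleaned, metadata


# A plain dictionary stands in for the data warehouse.
warehouse = {}
warehouse["sales"], warehouse["sales_metadata"] = clean(raw_rows, "orders.csv")

total = sum(row["amount"] for row in warehouse["sales"])
print(total, warehouse["sales_metadata"]["rows_rejected"])
```

Query, analysis, and agent tools would then work on the cleaned rows, with the meta-data available to explain where each figure came from.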
• Data warehouse
– A database containing cleaned-up data and
meta-data (information about the data).
Stored using high-capacity disk storage.
– Why it’s important: Data warehouses combine
vast amounts of data from many sources in a
database form that can be searched, for
example, for patterns not recognizable with
smaller amounts of data.
• Business-to-business systems (B2B systems) allow
businesses to sell to other businesses, using the
Internet or private network to cut transaction costs and
increase efficiencies.
• The Ethics of Using Databases: Concerns about
Accuracy & Privacy.
– In morphing, a film image is altered pixel by pixel, so that the
image becomes something else. This manipulation of digitized
images and sounds raises some ethical issues.
– Sound performances can be misrepresented, photos may be
manipulated, and video and TV images may be altered in
undetectable ways and all stored in a database.
– Databases are also limited in accuracy and completeness,
since not all facts can be found in a database, nor are all data
items true.
– In addition, databases raise several concerns about privacy.
Finally, those who own databases may be in a position to
monopolize information.
• Morphing
– Altering a film or video image displayed on a
computer screen pixel by pixel, or dot by dot.
– Why it’s important: Morphing and other
techniques of digital manipulation can
produce images that misrepresent reality.