Lecture 3
Slide 1
Hello again and welcome to COM 333 - and now our 3rd
lecture. Today we’re going to be talking about a brief history
of the Internet, and particularly the development of the World Wide Web,
and we’ll also introduce you to finding your way around the Internet.
Slide 2
Of course, we talked briefly about the Department of
Defense and the development of the ARPANET in our 1st lecture. As
you may recall, in the 1960’s the Department of Defense recognized the
need to link together those research institutions that were working on
DOD projects. And in particular they were interested in linking those
across the country so that they could share blueprints and other data on
satellite, submarine, and other projects. Well, the Internet continued
to grow in the 1970’s and was used largely by colleges and universities
and some other research institutions - engineering companies, defense contractors,
etc. And in the 1980’s, the Internet began to pick up speed as more and
more universities joined the net. It continued to grow and it seemed
to be growing rather rapidly, but little did we know how much faster the
Internet in fact would grow. Here at UIS (back then Sangamon State
University) we used Bitnet, which at that time was linked to the Internet,
for email communication, and we also accessed the Internet
for graphics and images. Well, beginning at the end of 1990 and continuing through 1991
and 1992, Tim Berners-Lee developed HTML - the Hypertext Markup Language.
Some great distance away (Berners-Lee was in Geneva, Switzerland, and
Marc Andreessen in Urbana, Illinois), Andreessen developed
Mosaic, which was a graphical web browser using HTML. What was important
was that, with the development of HTML and the near-simultaneous development of
a software tool that displayed HTML, one could see images in real time off
the Internet, rather than downloading them and opening them separately. And so the Internet then
became much more than it had been in the past, and in fact, even now we
must realize the Internet is more than just the World Wide Web. Email
of course, which we talked about last week, is an important component of
the Internet. But now the World Wide Web is increasingly the largest
part of our Internet experience.
Slide 3
Well, before the web, File Transfer Protocol (FTP) was the
mode of access and transfer of information across the Internet.
Largely, those computers that were linked by the routers of the time were
Unix-based systems, and they used certain protocols, and among them was
File Transfer Protocol. Stand-alone graphical viewers were used on
individual machines, on mini-computers and microcomputers, as well,
so that one would transfer an image, let’s say a satellite image from a
government site - the NOAA site or the national weather service to your
home computer. Then using a separate piece of software on your home
computer you would be able to then bring up the image. Now that process
took several minutes - much longer than it does today.
Modems of the time were transferring data at 300 baud or 1200 baud.
Baud translates roughly into bits per second. Now, of course,
many computers have 33,000 bits per second or 56,000 bits per second capability,
and with ISDN lines, 128,000 bits per second is not uncommon. Now
cable modems transfer data at more than a million bits per second, as
does ADSL - Asymmetric Digital Subscriber Line - sometimes abbreviated
simply as DSL. We’re going to be hearing a lot more about cable modems
and DSL as we move through 1999. Well, the web itself was the beginning
of e-commerce - electronic commerce. The potential was envisioned
once the web was developed and the Internet was no longer just the purview
for colleges, universities and researchers.
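Going back to the modem speeds mentioned on this slide, here is a minimal sketch in Python that puts them in perspective. The 20,000-byte image size is an assumption for illustration (the lecture gives no file size), and the arithmetic ignores protocol overhead, so real transfers would be somewhat slower.

```python
# Rough transfer-time estimate for one image at the line speeds from this slide.
# Assumption: a 20,000-byte image (hypothetical size, not from the lecture).
# Protocol overhead is ignored, so these are best-case figures.

IMAGE_BYTES = 20_000  # hypothetical image size in bytes

SPEEDS_BPS = {
    "300 bps modem": 300,
    "1200 bps modem": 1_200,
    "33,000 bps modem": 33_000,
    "56,000 bps modem": 56_000,
    "ISDN, 128,000 bps": 128_000,
    "cable or DSL, 1,000,000 bps": 1_000_000,
}


def transfer_seconds(size_bytes, bits_per_second):
    """Return the seconds needed to move size_bytes at bits_per_second."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    for label, bps in SPEEDS_BPS.items():
        seconds = transfer_seconds(IMAGE_BYTES, bps)
        print(f"{label:30s} {seconds:8.2f} seconds")
```

With these assumed numbers, the image takes roughly nine minutes at 300 bits per second and well under a second at a million bits per second.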
Slide 4
Commercial entities understood the potential of the Internet
and when they compared the Internet to newspapers, radio and television
for advertising they found the Internet was far less expensive. It
provided far more detailed data collection on those who visited the site
- and it allowed for built-in electronic ordering - it had so many features
that newspaper, radio and TV had only promised. Commerce now drives
the Internet and it really is the force that moves the Internet and World
Wide Web forward. Government, education and research, which really
were the founders of the Internet, now are moving to a new forum - NGI
and Internet2. NGI - the Next Generation Internet - is the
project promoted by the US federal government. Internet2
is a project promoted by what are now about 150 colleges and universities,
and it is supported on a grant-matching basis by the federal government.
But it’s important to note that these are two different initiatives - NGI
for the federal government and Internet2 for colleges and universities.
Ultimately, Internet2 is to expand to include and embrace K-12 schools,
as well.
Slide 5
Finding your way around the Internet - well no one person
knows what all is on the Internet - it changes every hour, every minute,
every second, and there’s no complete index. You know, years ago, even
in the 1980’s when I was teaching classes on new technologies and we talked
about the Internet, we described it as going to the Library of Congress
with all of the books on the floor and no card catalog. The first
attempt at organizing it came with a system called Gopher, which was developed at the home
of the Golden Gophers - the University of Minnesota. The Gopher system
was a non-graphical way of linking together locations on the Internet.
And then there were hotlists and hotlists proliferated. In fact,
a couple of grad students at Stanford developed one that finally outgrew
Stanford and ended up on its own as Yahoo!. Web crawlers ultimately
came about, such as AltaVista, Infoseek, HotBot, and there are so many
others. These web crawlers operate 24 hours a day, 365 days a year,
and they search domains constantly. They use bots - short for robots
- electronic robots that search the Internet, going from domain to domain,
and they collect information from millions, actually tens of millions,
of sites, and yet each of these covers only a fraction of the Internet.
Slide 6
Well, how do these work? The web crawlers
search the text of web pages, so they will look at the HTML code - virtually
look at the code - of the pages in the domains
they are searching, and they use the “meta” descriptors that are
normally embedded in the head of the HTML code, and they store the results
in huge data banks. They apply their own algorithms, their own formulas,
to these data banks of information. For example,
let’s say we were searching for the word baseball. The crawler would look
at the pages and find those that included the word baseball. Then,
even among those pages, it would determine which pages had the
term baseball more often, which ones had baseball in a larger font, and
which ones had baseball at the top of the page. Those kinds of considerations
figure into the algorithms that are used by the various web crawlers.
It takes about six to eight weeks for the web crawlers to get through all of
the domains that they cover on the Internet, and then they
start all over again. Of course, individuals can submit pages for
inclusion directly to most of the web crawlers.
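The baseball example above can be sketched in a few lines of Python. This is only an illustration of the kinds of considerations just described (how often the term appears, whether it is in a heading as a stand-in for "larger font", whether it is near the top, and whether it is in the meta descriptors). The weights, the sample pages, and the names `PageScanner`, `score`, and `rank` are all assumptions made up for this sketch; no real search engine's formula is being shown.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3"}  # assumed stand-in for "larger font"
TOP_WORDS = 5                      # assumed: leading words that count as "top of the page"


class PageScanner(HTMLParser):
    """Collect visible words, heading words, and meta keyword/description words."""

    def __init__(self):
        super().__init__()
        self.words = []          # every visible word, in page order
        self.heading_words = []  # words found inside heading tags
        self.meta_words = []     # words from <meta name="keywords"> or "description"
        self._heading_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() in ("keywords", "description"):
                content = (attributes.get("content") or "").lower()
                self.meta_words += content.replace(",", " ").split()
        elif tag in HEADING_TAGS:
            self._heading_depth += 1

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._heading_depth > 0:
            self._heading_depth -= 1

    def handle_data(self, data):
        tokens = [word.strip(".,!?;:\"'()").lower() for word in data.split()]
        tokens = [word for word in tokens if word]
        self.words += tokens
        if self._heading_depth:
            self.heading_words += tokens


def score(html, term):
    """Score one page for one term using made-up illustrative weights."""
    term = term.lower()
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    count = scanner.words.count(term)
    if count == 0 and term not in scanner.meta_words:
        return 0.0
    total = float(count)                               # how often the term appears
    total += 2.0 * scanner.heading_words.count(term)   # appears in a heading
    if term in scanner.words[:TOP_WORDS]:
        total += 3.0                                   # appears near the top
    if term in scanner.meta_words:
        total += 1.0                                   # listed in the meta descriptors
    return total


def rank(pages, term):
    """Return the names of matching pages, best score first."""
    scored = {name: score(html, term) for name, html in pages.items()}
    matches = [name for name in scored if scored[name] > 0]
    return sorted(matches, key=lambda name: -scored[name])


# Hypothetical sample pages, invented for this sketch.
PAGES = {
    "sports_page": (
        "<html><head><meta name='keywords' content='baseball, scores'></head>"
        "<body><h1>Baseball</h1><p>Baseball news and baseball scores.</p></body></html>"
    ),
    "town_page": (
        "<html><body><p>Our town has a park, a library, and a baseball field.</p>"
        "</body></html>"
    ),
    "recipe_page": "<html><body><p>Recipes for summer picnics.</p></body></html>",
}

if __name__ == "__main__":
    print(rank(PAGES, "baseball"))
```

In this sketch the sports page outranks the town page because the term appears more often, in a heading, near the top, and in the meta keywords, while the recipe page is left out entirely.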
Slide 7
Well, as you use those web crawlers, you use commonly
searched terms. And the way in which you use those terms differs somewhat
with each and every web crawler or search engine. Many of them use
Boolean logic terms - you may have studied these years ago - such as
and, or, and not in logical expressions. I had never heard of a logical
term called near, but near is used, in fact, by AltaVista, for example.
You might be looking for the word Bill near Clinton. And that means
that Bill has to appear in the text within 10 words of the word Clinton.
So near is a useful term and it is used by some of the search engines.
Most of the search engines use double quotation marks to designate a phrase
- if you want those words in that specific order, and that helps you to
narrow your search if you put a particular phrase in quotation marks.
Generally one avoids uppercase - not always, but using uppercase will generally limit
one’s search. Now some search engines will search more than just
the World Wide Web. Some search engines include Usenet - news groups
which we’ll be talking about in a couple of weeks. Others include
even news wires and some of the latest news from Associated Press, United
Press, Reuters.
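To make the near and quoted-phrase ideas from this slide concrete, here is a minimal sketch in Python. It assumes "within 10 words" means the two word positions differ by at most 10, and it is not how AltaVista or any other search engine actually implements these operators. The function names `tokenize`, `near`, and `phrase` are made up for this illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def near(text, first, second, window=10):
    """True if `first` appears within `window` words of `second` (assumed meaning)."""
    words = tokenize(text)
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    return any(abs(i - j) <= window for i in first_positions for j in second_positions)


def phrase(text, quoted):
    """True if the words of `quoted` appear in the text in that exact order."""
    words = tokenize(text)
    target = tokenize(quoted)
    size = len(target)
    if size == 0:
        return False
    return any(words[i:i + size] == target for i in range(len(words) - size + 1))


if __name__ == "__main__":
    sample = "Bill spoke before President Clinton arrived"
    print(near(sample, "Bill", "Clinton"))   # the two words are 4 positions apart
    print(phrase(sample, "Bill Clinton"))    # the words are not adjacent here
```

The near test is looser than the phrase test: the sample sentence satisfies Bill near Clinton, but it does not contain the exact phrase "Bill Clinton".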
Slide 8
Well, perhaps the most exciting recent development has
been the development of the meta-search engines. No single search
engine can keep up with the growth of the Internet. And so meta-search
engines have become very, very important. The meta-crawlers are the
crawlers of crawlers. That is, a meta-crawler is a search engine that
searches other search engines. So if you submit a search term to one
site, it re-submits that term to multiple other search engines and then
comes back with the results. Meta-crawlers speed up searches quite
a bit. They also provide more complete searches, and of course we
are going to be talking more about that, as you see on your syllabus, in
greater detail next week.
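The re-submit-and-combine idea behind a meta-search engine can be sketched in Python as follows. Everything here is an assumption for illustration: the three "engines" are canned lists rather than real services, the example.com addresses are placeholders, and the merging rule (adding up 1/rank across engines) is just one simple way to combine results, not the method any particular meta-crawler uses.

```python
from collections import defaultdict

# Hypothetical engines with canned results, standing in for real search sites.
FAKE_ENGINES = {
    "engine_a": {"baseball": ["example.com/mlb", "example.com/scores", "example.com/history"]},
    "engine_b": {"baseball": ["example.com/scores", "example.com/cards"]},
    "engine_c": {"baseball": ["example.com/scores", "example.com/mlb"]},
}


def search_one(engine, term):
    """Return one engine's ranked results for a term (empty list if none)."""
    return FAKE_ENGINES[engine].get(term.lower(), [])


def meta_search(term):
    """Send the term to every engine and merge the ranked lists into one."""
    scores = defaultdict(float)
    for engine in FAKE_ENGINES:
        for position, url in enumerate(search_one(engine, term), start=1):
            # A page ranked first by an engine adds 1, second adds 1/2, and so on.
            scores[url] += 1.0 / position
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


if __name__ == "__main__":
    for url in meta_search("baseball"):
        print(url)
```

A page that several engines rank highly rises to the top of the merged list, which is one reason a meta-search can give a more complete picture than any single engine.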
Slide 9
Well, now it’s important that you visit all those links
and we have some especially good ones this week. It’s time to go
to California to UC Berkeley and look at their super tutorial that’s provided
by the library at the University of California at Berkeley. This
is a great example of the way in which we use the Internet among universities.
UC Berkeley updates its tutorial every month and so even later this semester
if you re-visit that site, you’ll find new or additional information.
That tutorial, by the way, is highly rated across the
Internet. It may take you an hour or more to go through it, but you’ll
emerge a much better searcher, and it will really help you as you prepare
for your mid-term exercise and the final research project. Also,
check out that searching, sleuthing and sifting site at Sages College.
It is a wonderful site - I think you’re going to find nicely organized
information on how to find information on the Internet. The Internet
valley link is the one on the history of the Internet. I think you’re
going to find that one particularly graphical. It’s almost McLuhanesque
in its visualization. And Hobbes’ timeline is not quite so visual,
but is a great standard and was updated in January. It’s a wonderful
timeline of the development of the Internet - you can see how the Internet
picks up speed year by year. And then, of course, make sure that
you respond to the question of the week. And I’ll talk to you again
next week.