Semantic Web Book - Introduction

I had wanted to post the introduction to the Semantic Web book to the book's companion web site so that people could get a feel for what is in the book and its target audience.  Currently there is chapter 3 posted there but I don't control the site.  So, here is the introduction to The Semantic Web: A Guide to the Future of XML, Web services, and Knowledge Management.

Introduction

“The bane of my existence is doing things that I know the computer could do for me.”

-Dan Connolly, “The XML Revolution”

Nothing is more frustrating than knowing you have previously solved a complex problem but not being able to find the document or note that specified the solution. It is not uncommon to refuse to rework the solution because you know you already solved the problem and don’t want to waste time redoing past work. In fact, taken to the extreme, you may waste more time finding the previous solution than it would take to redo the work. This is a direct result of our information management facilities not keeping pace with the capacity of our information storage.

Look at the personal computer as an example. With $1000 personal computers sporting 60- to 80-GB hard drives, our document storage capacity (assuming 1-byte characters, plaintext, and 3500 characters per page) is around 17 to 22 million pages of information. Most of those pages are in proprietary, binary formats that cannot be searched as plaintext. Thus, our predominant knowledge discovery method for our personal information is a haphazardly created hierarchical directory structure. Scaling this example up to corporations, we see both the storage capacity and diversity of information formats and access methods increase ten- to a hundredfold multiplied by the number of employees.
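The page estimate above can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes decimal gigabytes (10^9 bytes, the way drive vendors usually count), and the function name pages_for_drive is ours, chosen for illustration.

```python
# Back-of-the-envelope check of the page-capacity estimate above.
BYTES_PER_GB = 10**9      # assumption: decimal gigabytes, not binary (2**30)
CHARS_PER_PAGE = 3500     # from the text: plaintext, 1 byte per character


def pages_for_drive(gigabytes: int) -> int:
    """Return how many whole plaintext pages fit on a drive of this size."""
    return (gigabytes * BYTES_PER_GB) // CHARS_PER_PAGE


for size_gb in (60, 80):
    pages = pages_for_drive(size_gb)
    print(f"{size_gb} GB holds about {pages / 1e6:.1f} million pages")
```

Under these assumptions the result is roughly 17.1 million pages for 60 GB and 22.9 million pages for 80 GB, so the "17 to 22 million" range quoted above is, if anything, slightly conservative at the upper end.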

In general, it is clear that we are only actively managing a small fraction of the total information we produce. The effect of this is lost productivity and reduced revenues. In fact, it is the active management of information that turns it into knowledge by selection, addition, sequence, correlation, and annotation. The purpose of this book is threefold. First, we lay out a clear path to improved knowledge management in your organization using Semantic Web technologies. Second, we examine the technology building blocks of the Semantic Web, including XML, Web services, and RDF. Lastly, not only do we show you how the Semantic Web will be achieved, we also provide the justifications and business case for how you can put these technologies to use for a significant return on investment.

Why You Should Read This Book Now

Events become interrelated into trends because of an underlying attractive goal, which individual actors attempt to achieve often only partially. For example, the trend toward electronic device convergence is based on the goal of packing related features together to reduce device cost and improve utility. The trend toward software components is based on the goal of software reuse, which lowers cost and increases speed to market. The trend of do-it-yourself construction is based on the goals of individual empowerment, pride in accomplishment, and reduced cost. The trend toward the Semantic Web is based on the goal of semantic interoperability of data, which enables application independence, improved search facilities, and improved machine inference.

Smart organizations do not ignore powerful trends. Additionally, if the trend affects or improves mission-critical applications, it is something that must be mastered quickly. This is the case with the Semantic Web. The Semantic Web is emerging today in thousands of pilot projects in diverse industries like library science, defense, medical, and finance. Additionally, technology leaders like IBM, HP, and Adobe have Semantic Web products available, and many more IT companies have internal Semantic Web research projects. In short, key areas of the Semantic Web are beyond the research phase and have moved into the implementation phase.

The Semantic Web dominoes have begun to tumble: from XML to Web services to taxonomies to ontologies to inference. This does not represent the latest fad; instead, it is the culmination of years of research and experimentation in knowledge representation. The impetus now is the success of the World Wide Web. HTML, HTTP, and other Web technologies provide a strong precedent for successful information sharing. The existing Web will not go away; the introduction of Semantic Web technologies will enhance it to include knowledge sharing and discovery.

Our Approach to This Complex Topic

Our model for this book is a conversation between the CIO and CEO in crafting a technical vision for a corporation. In that model, we first explain the concepts in clear terms and illustrate them with concrete examples. Second, we make hard technical judgments on the technology, warts and all. We are not acting as cheerleaders for this technology. Some of it can be better, and we point out the good, the bad, and the ugly. Lastly, we lay the cornerstones of a technical policy and tie it all together in the final chapter of the book.

Our model for each subject was to provide straightforward answers to the key questions in each area. In addition, we provide concrete, compelling examples of all key concepts presented in the book. Also, we provide numerous illustrative diagrams to assist in explaining concepts. Lastly, we present several new concepts of our own invention, leveraging our insight into these technologies, how they will evolve, and why.

How This Book Is Organized

This book is composed of nine chapters that can be read either in sequence or as standalone units:

Chapter 1, What Is the Semantic Web? This chapter explains the Semantic Web vision of creating machine-processable data and how we achieve that vision. It explains the general framework for achieving the Semantic Web, why we need the Semantic Web, and how the key technologies in the rest of the book fit into the Semantic Web. This chapter introduces novel concepts like the smart-data continuum and combinatorial experimentation.

Chapter 2, The Business Case for the Semantic Web. This chapter presents concrete examples of how businesses can leverage the Semantic Web for competitive advantage. Specifically, it presents examples on decision support, business development, and knowledge management. The chapter ends with a discussion of the current state of Semantic Web technology.

Chapter 3, Understanding XML and Its Impact on the Enterprise. This chapter explains why XML is a success, what XML is, what XML Schema is, what namespaces are, what the Document Object Model is, and how XML impacts enterprise information technology. The chapter then discusses why XML metadata is not enough and the trend toward higher data fidelity. Lastly, we close by explaining the new concept of semantic levels. For any organization not currently involved in integrating XML throughout the enterprise, this chapter is a must-read.

Chapter 4, Understanding Web Services. This chapter covers all aspects of current Web services and discusses the future direction of Web services. It explains how to discover, describe, and access Web services and the technologies behind those functions. It also provides concrete use cases for deploying Web services and answers the question “why use Web services?” Lastly, it provides a detailed description of advanced Web service applications, including orchestration and security. The chapter closes with a discussion of grid-enabled Web services and semantic-enabled Web services.

Chapter 5, Understanding the Resource Description Framework. This chapter explains what RDF is, the distinction between the RDF model and syntax, its features, why it has not been adopted as rapidly as XML, and why that will change. This chapter also introduces a new use case for this technology called noncontextual modeling. The chapter closes with an explanation of data modeling using RDF Schema. The chapter stresses the importance of explicitly modeling relationships between data items.

Chapter 6, Understanding the Rest of the Alphabet Soup. This chapter rounds out the coverage of XML-related technologies by explaining XPath, XSL, XSLT, XSL-FO, XQuery, XLink, XPointer, XInclude, XML Base, XHTML, XForms, and SVG. Besides explaining the purpose of these technologies in a direct, clear manner, the chapter offers examples and makes judgments on the utility and future of each technology.

Chapter 7, Understanding Taxonomies. This chapter explains what taxonomies are and how they are implemented. The chapter builds a detailed understanding of taxonomies using illustrative examples and shows how they differ from ontologies. The chapter introduces an insightful concept called the Ontology Spectrum. The chapter then delves into a popular implementation of taxonomies called Topic Maps and XML Topic Maps (XTM). The chapter concludes with a comparison of Topic Maps and RDF and a discussion of their complementary characteristics.

Chapter 8, Understanding Ontologies. This chapter is extremely detailed and takes a slow, building-block approach to explain what ontologies are, how they are implemented, and how to use them to achieve semantic interoperability. The chapter begins with a concrete business example and then carefully dissects the definition of an ontology from several different perspectives. Then we explain key ontology concepts like syntax, structure, semantics, pragmatics, extension, and intension. Detailed examples of these are given, including how software agents use these techniques. In explaining the difference between a thesaurus and an ontology, an insightful concept is introduced called the triangle of signification. The chapter moves on to knowledge representation and logics to detail the implementation concepts behind ontologies that provide machine inference. The chapter concludes with a detailed explanation of current ontology languages, including DAML and OWL, and offers judgments on the corporate utility of ontologies.

Chapter 9, Crafting Your Company’s Roadmap to the Semantic Web. This chapter presents a detailed roadmap to leveraging the Semantic Web technologies discussed in the previous chapters in your organization. It lays the context for the roadmap by moving from the current state of information and knowledge management in most organizations to a detailed vision of a knowledge-centric organization. The chapter details the key processes of a knowledge-centric organization, including discovery and production, search and retrieval, and application of results (including information reuse). Next, detailed steps are provided to effect the change to a knowledge-centric organization. The steps include vision definition, training requirements, technical implementation, staffing, and scheduling. The chapter concludes with an exhortation to take action.

This book is a comprehensive tutorial and strategy session on the new data revolution emerging today. Each chapter offers a detailed, honest, and authoritative assessment of the technology, its current state, and advice on how you can leverage it in your organization. Where appropriate, we have highlighted “maxims” or principles on using the technology.

Who Should Read This Book

This book is written as a strategic guide for managers, technical leads, and senior developers. Some chapters will be useful to all people interested in the Semantic Web; some delve deeper into subjects after covering all the basics. However, none of the chapters assumes an in-depth knowledge of any of the technologies.

While the book was designed to be read from cover to cover in a building-block approach, some sections are more applicable to certain groups. Senior managers may only be interested in the chapters focusing on the strategic understanding, business case, and roadmap for the Semantic Web (Chapters 1, 2, and 9). CIOs and technical directors will be interested in all the chapters but will especially find the roadmap useful (Chapter 9). Training managers will want to focus on the key Semantic Web technology chapters like RDF (Chapter 5), taxonomies (Chapter 7), and ontologies (Chapter 8) to set training agendas. Senior developers and developers interested in the Semantic Web should read and understand all of the technology chapters (Chapters 3 to 8).

What's on the Companion Web Site

The companion Web site at www.wiley.com/compbooks/daconta contains the following:

Source code. The source code for all listings in the book is available in a compressed archive.

Errata. Any errors discovered by readers or the authors are listed with the corresponding corrected text.

Code appendix for Chapter 8. As some of the listings in Chapter 8 are quite long, they were abbreviated in the text and are posted in their entirety on the Web site.

Contact addresses. The email addresses of the authors are available, as well as answers to any frequently asked questions.

Feedback Welcome

This book is written by senior technologists for senior technologists, their management counterparts, and those aspiring to be senior technologists. All comments, suggestions, and questions from the entire IT community are greatly appreciated. It is feedback from our readers that both makes the writing worthwhile and improves the quality of our work. I’d like to thank all the readers who have taken time to contact us to report errors, provide constructive criticism, or express appreciation.

I can be reached via email at mike@daconta.net or via regular mail:

Michael C. Daconta

c/o Robert Elliott

Wiley Publishing, Inc.

111 River Street

Hoboken, NJ 07030

Best wishes,

Michael Daconta

Sierra Vista, Arizona