dg.o2003 to Display Extraordinary Diversity
Conference in Boston to show off the dazzling depth and multidisciplinary richness of the Digital Government research community
DGRC Communications Manager
In its four short years, the Digital Government research program has grown to span myriad disciplines in its exploration of government IT problems:
Metadata, public policy, network architecture, wireless communications, biodiversity, geospatial information and all manner of data analysis and integration are represented at this year's National Conference on Digital Government Research.
Some of the nation's top IT researchers and their government partners are scheduled to present nearly 100 papers, demos and posters at dg.o2003 in Boston on May 19-21. The proceedings will be distributed at the conference, and posted on DigitalGovernment.org shortly afterward.
Until then, here are previews of just a few projects that demonstrate the breadth and depth of this fast-maturing community and the body of practical scientific knowledge it is building from theory and hard research:
Designing a Metadata-Driven Visual Information Browser for Federal Statistics
By Bill Kules and Ben Shneiderman
Human-Computer Interaction Laboratory, University of Maryland at College Park
With more than 70 federal agencies posting reports, charts, statistics and other information on the Web, the task of finding a particular nugget of information can be incredibly daunting, the researchers report. While public-domain search engines such as Google can help speed the process, the heterogeneity of formats, platforms and terminology among these agencies - each with its own information culture - can confound most searches.
The Maryland team has designed the FedStats Browser, a GUI-driven tool for navigating this ocean of data that promises to let users narrow their queries swiftly to reach their targets. It will be presented Tuesday at 10 a.m. during the Metadata session.
The difference between this tool and Google is that it lets users see the information space where the data resides without first formulating a query, Kules and Shneiderman's paper reports.
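Browsing by metadata rather than by query can be pictured as simple faceted filtering. The sketch below is only illustrative - the attribute names (agency, year) are hypothetical, not the FedStats Browser's actual schema:

```python
def browse(records, **facets):
    """Narrow a collection of document records by exact-match metadata
    attributes, the way a metadata-driven browser lets users refine a
    result set without first formulating a text query.
    (Toy illustration; field names are invented, not FedStats metadata.)"""
    return [r for r in records
            if all(r.get(k) == v for k, v in facets.items())]

# Hypothetical records from statistical agencies:
docs = [
    {"title": "Employment Situation", "agency": "BLS", "year": 2002},
    {"title": "Population Estimates", "agency": "Census", "year": 2002},
    {"title": "CPI Detailed Report", "agency": "BLS", "year": 2001},
]
hits = browse(docs, agency="BLS", year=2002)
print([d["title"] for d in hits])
```

Each facet the user selects simply intersects the collection further, which is why rich metadata matters: the filters are only as good as the attributes captured.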
"Part of the purpose of the browser was to show our agency partners a 'what-if' scenario: What if we had a rich set of metadata that captures attributes of the documents that end-users find valuable?" says Kules, referring to the Census Bureau, the Bureau of Labor Statistics and other government agencies with which his team collaborated.
"One of the tasks of our NSF project is to gain a more thorough understanding of end-user metadata needs, which can guide the selection of attributes to capture and use in an interface such as this," he says. "The browser has contributed to our dialog with agency partners in that area." The FedStats team is still in the process of gathering metadata to help speed the browsing process, but is considering using machine-learning techniques to automate classification of government data covered by its crawlers.
"We're working with agencies to identify ways of leveraging their survey-publication process to gather the required metadata with minimal additional effort on their production staff," Kules says. "But it's a substantial challenge, and there's a lot of discussion on how to best gather the data, both for our research purposes and for production use in a real agency."
While most of the controls are text-oriented, Kules says he wants to expand the interface to apply information visualization ideas to the task of browsing.
Testbed for High-Speed End-to-End Communications in Support of Comprehensive Emergency Management
Charles Bostian, Scott Midkiff, Tim Gallagher and Mary Miniuk
Center for Wireless Telecommunications, Virginia Tech
As one of several Digital Government projects focused on issues raised by the Sept. 11, 2001 terrorist attacks, this project is delving into methods of beating the limitations of the conventional telecommunications systems that often fail in the face of catastrophe.
Building upon an earlier-developed geographic information system (GIS) tool, the Virginia Tech team has been exploring the limitations of conventional wireless technology in seeking to design an efficient, independent and rapidly-deployable wireless communications network.
"The hub unit consists of a radio, modem, and router with a connection to the internet," explains research engineer Tim Gallagher. "A notebook computer will be connected to the router and have video conference capabilities. The remote unit will consist of a radio, modem and router and act as an 802.11b access point. This allows the remote unit to support a wireless laptop with video conference capabilities and two PDAs. The broadband sounder for channel characterization is integrated into the radios and the GIS software is loaded on one of the laptops."
On Monday evening at 7 p.m., the researchers will be giving a system demonstration of their work so far, followed on Tuesday at 4 p.m. by a presentation in the Posters session. The demo will include running several applications over the network, including video conferencing, web browsing and instant messaging.
In the field, "the hub unit would be placed at the edge of the disaster/emergency area and be connected back to the internet either through surviving fiber or cable, or, in the absence of a physical internet connection, through a satellite link," Gallagher says. "The remote units would then be located within the emergency region and provide local wireless access to responders in the field. Initial deployment considerations would be based on GIS and channel sounder information. Once deployed, the system would be able to intelligently adapt its network configuration based on updated GIS and channel sounder information."
Further research must be done before the system is deployable in emergency situations, including performance evaluation, identifying the optimum channel on which to operate, and adapting the network configuration based on the channel's response. In the end, he says, the system must be integrated, easy to set up and robust enough to adapt to the environment.
Turning to Digital Government in a Crisis
Sharon S. Dawes, Bruce B. Cahan, Anthony M. Cresswell
Center for Technology in Government, University at Albany; Urban Logic, Inc.
Also triggered by the events of 9/11, a team of SUNY researchers partnered with New York nonprofit Urban Logic (which was involved in the response) to delve into the role played by government IT during the destruction of the World Trade Center towers and the aftermath.
"There was not just one kind of communication problem," report Dawes and Cresswell. "We learned about at least four:
- loss of circuits & infrastructure - damage at 140 West St./Verizon, and loss of cell towers;
- lack of information about various aspects of the situation, e.g., air quality, extent of damage to structures, identity/location of victims, etc.;
- huge increase in public demand and urgency for risk and emergency response/recovery information; and
- loss of or barriers to access to information caused by the collapse of the towers and other buildings.
"For example, Cantor Fitzgerald had off-site backups for its business data, but everyone who knew the passwords to access the backups was killed. There was no centralized list of who worked in the towers or even what businesses were located there; the NYS Dept. of Labor had to construct a list from existing databases. Even organizations not located in the towers learned that they did not have complete and up-to-date contact information for their employees, and so took a fair amount of time to mobilize.
"These are distinctly different problems that required different responses. For example, the NY State Police lost much of their NYSPIN network capacity, but patched messages around through ad hoc use of available voice and fax circuits until the network itself came back up.
"EPA had to invent technology to make huge quantities of air quality data available to the public by a new channel--a web site. In short, there was a double need that stretched resources: to both restore pre-existing communications and to establish new communication capacities needed for response and mitigation."
Studying data needs and resources, the use of IT, interagency behavior and response and the effect of rules and laws on the ability to respond during and after the disaster, the New York team's exploratory paper codifies some of the lessons that were learned and maps areas for future research with findings such as these:
- The Internet - particularly the Web and Internet telephony - worked when other networks failed
- Wireless computing power proved essential, though not widespread
- Communications nets thought to be redundant were in fact running on the same infrastructure
- GIS proved useful, but also emphasized the need for well-understood data management techniques and data quality control
- Public information mechanisms must be accurate, timely, authoritative, accessible and diverse
- Data coordination and integration problems surfaced quickly and continue to persist, such as multiple casualty lists that needed continual reconciliation
- Data issues (quality, access, use, sharing, security) far outweighed technology problems and were, and remain, harder to solve.
As for "best practices" indicated by the results of the study, Dawes and Cresswell say this: "The ability to identify and tap sources of replacement equipment and other resources was critical to re-establishing communications infrastructures.
"Some of this capacity was internal to large organizations, like Verizon, which has a huge resource base to draw from--size matters. Some is from working relationships among users and suppliers--e.g., the City and Verizon both leaned heavily on Nortel and Cisco to provide hardware.
"Contingency plans for replacing lost/damaged elements of critical infrastructure should be required. Increased redundancy in Internet-TCP/IP infrastructure will help prevent loss of web access as a critical communications link."
The paper is due to be presented Wednesday at 8 a.m. in the session titled "The Citizen & Government."
Social Welfare Program Administration and Evaluation and Policy Analysis Using Knowledge Discovery and Data Mining (KDD) on Administrative Data
Hye-Chung (Monica) Kum*, Dean Duncan**, Kimberly Flair**, Wei Wang*
University of North Carolina at Chapel Hill
* Department of Computer Science & Jordan Institute for Families, **Jordan Institute for Families, School of Social Work
One of several projects devoted to unifying the dizzying array of formats, platforms and presentation methods that stand between users and massive federal datasets, the UNC team's work has focused on an integrated approach via knowledge discovery and data mining (KDD).
The toolsets they are experimenting with offer tremendous potential benefits for social welfare program administration, program evaluation and policy analysis at the federal and state levels.
The researchers collaborated with the North Carolina Department of Health and Human Services on exploring ways to track and analyze patterns in the monthly data summaries generated by various welfare services, including Work First, Medicaid and Food Stamps. The data generated by these programs are collated in a mainframe database management system meant for optimal operation and service provision, not for information gathering or dissemination; even so, the researchers were able to use KDD techniques to extract and re-present the data in a way that allowed easier administrative tracking and analysis of the trends taking place in the populations the agency serves, the paper reports.
For the portion of the project devoted to sequential (ordered-list-of-sets) analysis - the computer science research side of the project - the team focused on two research goals, says lead researcher Monica Kum:
- How can we summarize/compress sequential data into something that can be viewed by people?
- Can we identify the major patterns underlying the sequential data as well as the common variations to the pattern?
"Although there is room for improvement, we were able to accomplish both to some satisfaction. We were also able to answer this question: How accurate were the resulting summarizations and the patterns from our algorithm?" she says. "We devised a solid evaluation method that showed that our approximate method (ApproxMAP) was effective and efficient. Also, when used on real data, we could confirm that much of the results were predictable (we saw the major patterns that we knew existed in the data from knowing the real policies and practices) and understandable (when we saw patterns that were not obvious from practice, we could understand how a group of sequences would form such a pattern)."
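The summarization goal Kum describes can be illustrated with a toy consensus step. This is a minimal sketch, not ApproxMAP itself: the real algorithm first clusters the sequences and multiply aligns them, steps omitted here, and the program codes below are invented:

```python
from collections import Counter

def consensus_pattern(aligned_seqs, theta=0.5):
    """Derive a consensus pattern from pre-aligned sequences of itemsets.

    Each sequence is a list of sets, with position i of every sequence
    assumed to be aligned. An item survives at position i if it occurs
    there in at least `theta` of the sequences; empty positions are
    dropped. (Toy sketch only - ApproxMAP's clustering and multiple
    alignment are not shown.)"""
    n = len(aligned_seqs)
    pattern = []
    for i in range(len(aligned_seqs[0])):
        counts = Counter(item for seq in aligned_seqs for item in seq[i])
        frequent = {item for item, c in counts.items() if c / n >= theta}
        if frequent:
            pattern.append(frequent)
    return pattern

# Three hypothetical monthly service histories (WF = Work First,
# MD = Medicaid, FS = Food Stamps - codes invented for illustration):
seqs = [
    [{"WF"}, {"WF", "MD"}, {"MD"}],
    [{"WF"}, {"MD"}, {"MD", "FS"}],
    [{"WF"}, {"WF", "MD"}, {"FS"}],
]
print(consensus_pattern(seqs, theta=0.5))
```

Raising `theta` toward 1.0 keeps only items common to nearly all sequences, yielding the major pattern; lowering it surfaces the common variations Kum mentions.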
The research has tremendous potential benefit, Kum says:
"This technology provides the social workers with the macro view of the policies, the situations, and the impact. Before, being in the trenches, it was difficult for the social workers to see the forest. Now, the social workers can get the basic information about the program they supervise and how it impacts their community with a click of a mouse. They can understand the trends in North Carolina, as well as their county, and other similar counties. They can also track changes over time. I believe this can eventually have a great impact on their day to day decisions about individual clients."
The chief obstacle in putting this into practice will be the social workers' current workload, she notes: "No part of their job requires them to look at the site, and they are already swamped with too much work. So, although such wealth of information is there, due to lack of time, training, and habit, I think most social workers do not look at the information. More effort will be needed on this end for technology to really be transferred properly."
The North Carolina team is scheduled to present its findings at 10 a.m. Monday in the Data & Statistics session, and demonstrate its system during the 7 p.m. Monday demo session.
Regulatory Information Management and Compliance Assistance
Shawn Kerrigan, Charles Heenan, Haoyi Wang, Kincho H. Law and Gio Wiederhold
Stanford University
Expanding on its ongoing REGNET research into information management infrastructure, the Stanford team worked with state and federal environmental protection agencies to develop tools for organizing and interpreting environmental regulations.
The researchers' paper describes the methods of text mining, data extraction and language parsing they are using to build an XML framework for environmental laws, including extracting, cleaning and defining more than 65,000 concepts in regulations found under Title 40 of the Code of Federal Regulations. The goal is to enable easier categorization and review of these often complex laws. The paper is due to be presented Tuesday at 1:30 p.m. in the "Information Management" session, with a system demonstration to follow at 4 p.m.
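The basic idea of structuring flat regulation text as XML can be sketched very loosely as follows. This is not the REGNET pipeline - the heading pattern, section numbers and sample text are all invented for illustration:

```python
import re
import xml.etree.ElementTree as ET

def regulation_to_xml(text):
    """Split flat regulation text into an XML tree keyed by section number.

    Toy illustration only: REGNET's actual approach combines text mining,
    data extraction and language parsing. Here a single regex recognizes
    hypothetical headings of the form '40.112 Heading.' and nests the
    prose that follows each one."""
    root = ET.Element("regulation", title="40")
    for match in re.finditer(
            r"^(\d+\.\d+)\s+([^\n]+)\n(.*?)(?=^\d+\.\d+\s|\Z)",
            text, re.M | re.S):
        sec = ET.SubElement(root, "section", number=match.group(1))
        ET.SubElement(sec, "heading").text = match.group(2).strip()
        ET.SubElement(sec, "body").text = match.group(3).strip()
    return ET.tostring(root, encoding="unicode")

# Invented sample text in the style of an environmental regulation:
sample = """40.112 Surface impoundments.
Owners must maintain liners.
40.113 Closure.
Final cover must be installed."""
print(regulation_to_xml(sample))
```

Once regulations are in a structured form like this, categorization, cross-referencing and review tools can operate on sections and concepts rather than raw text.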
The State Cancer Profiles Web Site and Extensions of Linked Micromap Plots and Conditioned Choropleth Map Plots
Daniel B. Carr, Sue Bell, Linda Pickle, Yuguang Zhang and Yaru Li
George Mason University (GMU), National Cancer Institute (NCI)
Launched in mid-April, the map-based data interpretation tools on this site are the culmination of four years of DG-funded research into the problem of rendering complex, geospatially-indexed statistical data sets browsable and searchable for planners, policymakers and epidemiologists.
NCI co-developed the site with the Centers for Disease Control and Prevention (CDC) and partnered with the GMU team to research the best graphics-based methods for allowing data users to discern patterns that emerge at the intersection of cancer statistics, geospatial data and socioeconomic data.
The Java-based shareware that generates the mapping systems has been offered for use by other agencies including the U.S. EPA, Bureau of Labor Statistics, Bureau of Transportation Statistics and Department of Agriculture, which are using it to study biodiversity, unemployment statistics, pedestrian fatalities and crop performance.
The GMU team is scheduled to show off the system at the Tuesday, 4 p.m. system demonstrations, and to present its findings on Wednesday at 10 a.m. during "Internet / Web Applications (II)."
Scientist, Politician and Bureaucrat Subcultures as Barriers to Information-Sharing in Government Agencies
David B. Drake, Marianne J. Koch and Nicole A. Steckler
OGI School of Science and Engineering - Oregon Health & Science University
With government being the entrenched and slow-to-move mega-organization that it is, effecting change on an IT level has proved largely a series of pitched battles to change minds as well as systems.
Thus, research into organizational behavior and management has become one of the Digital Government community's fastest-growing subdisciplines. Researchers at the Oregon Health & Science University have been studying the effect of cultural differences on information-sharing practices which, ultimately, affect the ability of CIOs to enact e-government solutions.
Their paper identifies these organizational subcultures:
The bureaucrat subculture - the day-to-day decision-makers who build and maintain public services and try to remain non-partisan while enforcing politically-influenced laws and regulations;
The politician subculture - the appointed leaders who must balance all the influences at work in directives from three branches of government, manage on behalf of various interests and the public good and lead their agencies and manage public resources by advocacy, negotiation and compromise; and
The scientist subculture - the science and technical specialists who build upon existing knowledge and seek to solve societal problems under the scientific method of selecting, collecting and analyzing data.
The widely diverging interests, philosophy, world view and practices of these subcultures can affect the way they use, value and, ultimately, share information, report the Oregon researchers, whose paper identifies these four main "areas of challenge and opportunity" in analyzing the way information is shared and decisions made:
- Data is political
- Information is structural
- Knowledge is cultural
- Wisdom is relational
An in-depth presentation on these findings is scheduled for Monday at 1:30 p.m. in the session titled Data Sharing and Integration (I).
Finally, the body of knowledge generated by digital government research has grown so diverse and complex that it requires its own search engine. A University of Arizona project is devoted to building just that:
DGPort: A Web Portal for Digital Government
Chun Q. Lin, L. Dwayne Nickels, Charles Zhi-kai Chen, T. Gavin Ng and Hsinchun Chen
Artificial Intelligence Lab, University of Arizona
As the digital government community has grown, so has its hunger for - and production of - information related to government IT issues.
On a regular basis, scientific papers, Web pages, journals, technical reports and other data are being churned out and sought out by people studying and engaged in DG research.
Using conventional search techniques, meta-searching and artificial intelligence technology such as automatic summarization, the Arizona researchers plan to develop a focused Web portal for digital government information, says their paper, due to be presented at 8 a.m. Wednesday during the session "Internet / Web Applications (I)."
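The automatic-summarization piece can be illustrated with a classic frequency-based extractive summarizer - a generic sketch of the genre, not the Arizona team's actual technology, with an invented sample document:

```python
import re
from collections import Counter

def extractive_summary(text, k=2):
    """Score each sentence by the corpus frequency of the words it
    contains and keep the top k sentences, in their original order.
    (Generic extractive technique for illustration; DGPort's actual
    summarization method is not described here.)"""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    scored = sorted(sentences, key=lambda s: -sum(
        freq[w] for w in re.findall(r"[a-z']+", s.lower())))
    keep = set(scored[:k])
    return " ".join(s for s in sentences if s in keep)

# Invented three-sentence document; the off-topic middle sentence
# scores lowest and is dropped:
doc = ("Digital government research spans many disciplines. "
       "The cat sat. "
       "Government research data and digital research tools grow yearly.")
print(extractive_summary(doc, k=2))
```

A focused portal would apply this kind of condensation to crawled pages and reports so that search results can be skimmed without opening each document.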
"In the long term, the potential audience for DG Port is enormous: virtually anyone from any community with an information need that can be fulfilled through government-published information is a potential user of DGPort," says Lin. "Of course, DGPort encompasses other information sources as well, such as news sources, which broadens the potential audience even more."
The next phase of research will be to test the DGPort technology with an audience of digital government researchers to evaluate its efficiency, followed by refinement and a release to a broader audience, Lin says.
For more information on current Digital Government Research, please subscribe to dgOnline and read past issues.