Crafting a title that is both attractive and descriptive of a written work using only a few words can be very difficult. The purpose of this project was to create an automated text-titling program that could assist authors in choosing a title for their work. The system was designed to generate coherent and sensible titles for formal research papers as well as lay scientific publications. I completed this class project with two other students.
– Create an automated system to help authors title their works
– Measure how well the program performs
– User Researcher
Title generation can be considered a form of categorization. When creating a title, an author aims to identify and convey key points from the text. The titling system, named Sci-Ti (pronounced like “Sci-Fi”), processes multiple percepts in the form of sentences and words, maps them to key points, and produces titles that are relevant to those key points. See Figure 1 for a representation of this process.
Figure 1. Visual Representation of Sci-Ti’s Procedure
Sci-Ti’s title generation process can be broken down into three general phases:
1. Preprocessing
2. Word weighting
3. Title generation
During the preprocessing stage, Sci-Ti accepts a text file and parses it into words. Sci-Ti then performs multiple calculations based on each word’s proximity within the text and its frequency to produce an overall word weight. The system uses that weight, along with each word’s part of speech, to generate 10 candidate titles for the text. The exact coding strategies and algorithms used during each step are beyond the scope of this project summary.
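Since the exact algorithms are not detailed here, the following is only an illustrative sketch of the general idea: weight each word by combining its frequency with a simple proximity signal (here, assumed to be position in the text), then rank the words. The scoring formula and the `window`-free position bonus are assumptions for illustration, not Sci-Ti’s actual method.

```python
import re
from collections import Counter

def word_weights(text):
    """Sketch of frequency-plus-proximity word weighting.
    Assumption: earlier occurrences earn a larger proximity bonus."""
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)
    bonus = {}
    for i, w in enumerate(words):
        # Position-based bonus in [0, 1]; keep the best occurrence per word.
        b = 1.0 - i / max(len(words), 1)
        bonus[w] = max(bonus.get(w, 0.0), b)
    # Final weight: frequency scaled by (1 + proximity bonus).
    return {w: freq[w] * (1.0 + bonus[w]) for w in freq}

sample = ("Solar panels convert sunlight into electricity. "
          "Solar energy is renewable.")
ranked = sorted(word_weights(sample).items(), key=lambda kv: -kv[1])
print(ranked[:3])
```

In a full pipeline, the top-weighted words would then be filtered by part of speech and assembled into candidate title templates.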
Evaluation of System
Sci-Ti was designed with a minimalist user interface. A user can launch the system using the command line and then interact with a basic GUI (See Figure 2). The evaluation for this project was focused on the system’s output capability, not its usability.
Although there are no well-known standards for title validity, it is still possible to judge Sci-Ti’s performance against the goal of the project. As previously stated, Sci-Ti’s purpose is to systematically extract keywords from a text to generate descriptive titles that are coherent and sensible to human readers.
To test the different components of this claim, I identified characteristics that could be judged by humans. I designed a survey that used Likert scales to gather participants’ judgments about different text titles. The survey included a relevance scale to measure the “descriptive” claim, and additional coherence and attractiveness scales to assess the “coherent and sensible” claim. Under these evaluation standards, if Sci-Ti-generated titles were judged to fulfill those characteristics, the claim would be supported.
To collect data, a survey was distributed to students, and a total of 17 responses were gathered. The survey was formatted to be self-explanatory and require no moderator. The first section of the survey contained the following brief explanation: “The following questionnaire will be used to evaluate the effectiveness of certain titles for specific texts. The questionnaire should take less than 10 minutes in total.” This introductory text was purposely vague about the source of the titles to prevent bias in participant responses.
All parts of the survey were formatted in the same fashion: a passage was presented along with its associated title followed by a series of three Likert scale questions. See Figure 3 for the format of the questions.
The survey consisted of six lay passages and titles. Texts were taken from short (100-350 word) scientific articles. The questions were presented in a consistent order for each participant. Two of the passages were shown with their original human-generated titles (h), two articles were shown with randomly generated titles created by Sci-Ti when word weights were ignored (r), and two passages with titles created by Sci-Ti when it used word weights (s). These were ordered as follows: h, s, h, r, s, r.
Sci-Ti’s titles received relevance ratings comparable to those of the human-generated titles. The two human titles received mean relevance ratings of 4.3529 and 3.671, while Sci-Ti’s titles were rated 4.0588 and 3.3529. Mean scores above 3 on the 5-point Likert scale suggest evaluators found a title relevant, since a score of 3 represents “neither relevant nor irrelevant” on the scale. Therefore, both the human and Sci-Ti titles can be said to have been found relevant by the human evaluators (see Figure 4 for a graphical representation of the results). This suggests that Sci-Ti adequately creates relevant titles.
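The relevance comparison above reduces to checking each condition’s mean rating against the scale midpoint. A minimal sketch of that check, using the means reported in this summary:

```python
# Mean relevance ratings reported in the study (5-point Likert scale).
human_means = [4.3529, 3.671]
sci_ti_means = [4.0588, 3.3529]

MIDPOINT = 3.0  # "neither relevant nor irrelevant"

# A title is read as relevant when its mean rating exceeds the midpoint.
human_relevant = all(m > MIDPOINT for m in human_means)
sci_ti_relevant = all(m > MIDPOINT for m in sci_ti_means)

print(human_relevant, sci_ti_relevant)  # True True
```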
Figure 4. Relevance Scores
Sci-Ti’s coherence rating was higher than that of the randomly generated titles, but not as high as that of the human-created titles. Many evaluators gave Sci-Ti’s titles attractiveness scores of 3 or 4, whereas the other types of titles often received more variable scores. Overall, attractiveness was the least conclusive of the three aspects measured because responses on this scale had large standard deviations.
Based on the results of the user study, the quality of Sci-Ti’s outputs is inconclusive. Future studies with revised evaluation criteria, larger sample sizes, and improved survey methodology could help provide more conclusive data.
Summarizing a text is not a simple process for humans, but Sci-Ti is capable of reproducing part of that procedure using a technique unlike that of humans. Sci-Ti is in its early stages of development, but it could be refined to improve its creativity, title-generating capabilities, and usability.
Video Overview of Project
Project on Github: Click here to view the Github repository