How to Use GPT-4 to Summarize Documents for Your Audience

Nick Diakopoulos · Published in Generative AI in the Newsroom · 6 min read · Apr 11

Editor’s Note: This post has been edited to include an addendum about corrections made to one of the articles referenced.

There’s a lot happening in the field of generative AI. You could easily burn out trying to stay on top of it all. That’s why I wanted to see if I could use GPT-4 to help summarize some of the latest research and provide tailored summaries for the audience here. Today I published a couple of examples of this, which you can read here and here.

To generate these articles I used a series of prompts to OpenAI’s GPT-4 model to (1) analyze the research papers and extract particular pieces of information, and then (2) write a summary based on those bits of extracted information. By breaking the process down into two steps I was able to better control the information that would be included in the final summary articles.

To define what I wanted the model to analyze in step 1, I first thought about the audience for this blog: journalists such as reporters and editors who might want to know what a new piece of research means for their practice and whether there are any limitations that would curtail its value or utility. I then experimented with some prompts and settled on three to extract information from the document text [1, 2, 3]. Each of these papers is short enough that I could include its entire text in GPT-4. I fed the output from these three prompts into a final prompt to generate the first article [4]. I tweaked the prompt for the second article to aim for a more accessible, blog-like style [5]. To see how I configured all the prompts, including the system prompt, the document prompts, and other model parameters, see the code in this Colab Notebook.

Accuracy Checking

One of the biggest concerns about generative AI summaries is the potential for fabrication of information. An article about a research paper needs to ensure that facts are accurate and consistent with respect to that paper.

For the first paper, I read it thoroughly before I tried to summarize it. This allowed me to quickly assess whether the generated summary was accurate with respect to the paper. While I didn’t see any outright fabrication, one of the generations included a sentence that didn’t make sense to me, and another sentence was a bit confusing or potentially misleading. This reaffirmed that you really do need to have a human in the loop. Having read the paper, which took me about 30 minutes, it only took about 5 minutes to read and assess the final output.

For the second paper, I summarized it automatically without reading it. But this time I spent more time fact-checking the output, reading each sentence and checking whether it was consistent with the underlying paper. I added one snippet of text (“e.g. radio, speech, TV, etc.”) to help address what I thought was an issue of specificity in the writing. This editing process took about 10 minutes, less time overall than for the first paper since I only read small excerpts of the paper as I was editing.

To create illustrations for the final articles I followed some of the advice here and manually prompted DALL-E until it generated some images that I thought were reasonable. That took perhaps another 10 minutes for each article.
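(For readers who would rather script the illustration step than prompt DALL-E by hand, here is a minimal sketch using the openai Python package’s pre-1.0 image interface. The prompt text, key placeholder, and image size are illustrative assumptions, not the prompts used for these articles.)

```python
# Illustrative sketch only: the post describes prompting DALL-E manually,
# but the same step could be scripted with the openai Python package
# (pre-1.0 interface shown). The prompt below is a made-up placeholder.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Image.create(
    prompt="An abstract illustration of a robot reading a stack of research papers",
    n=1,                # number of candidate images to generate
    size="1024x1024",   # one of the supported sizes: 256x256, 512x512, 1024x1024
)

# The API returns a temporary URL for each generated image.
print(response["data"][0]["url"])
```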
So in total, the first article took me 45 minutes, and the second one took me about 20 minutes. In both cases it took about 5 minutes for the model to extract information and then output an article.

Further Reflections

I think GPT-4 is a viable technology for accelerating the translational coverage of research for particular audiences. It drastically reduced the time and effort needed to produce a tailored summary, to about 20 minutes. This could be combined with a prior use-case I’ve developed on story discovery to filter new research articles of relevance to an audience and then automatically generate blog posts summarizing those papers.

The key way that journalists can differentiate summary articles is by focusing on extracting different pieces of information in the first step. In my case I focused on key findings of interest to the target audience (reporters and editors), including benefits, limitations, and critiques. But others could configure their own questions and frames of interest to different audiences and arrive at different outputs. Basically, it’s up to the journalist to define what matters and use that to drive the summary.

The other key area where journalists still need to be involved is in editing and fact-checking the resulting articles, including by checking for any sentences that are too similar to sentences in the original research paper and might need to be quoted in order to avoid potential plagiarism. I have a nagging feeling that even though I configured the AI to be critical, it probably has some blind spots and could miss something. In an ideal world, the outputs of this process would not only get edited for accuracy but serve more as a first draft for a reporter to write through, or even just provide an impetus to go do more reporting.

I’ll also admit that the output articles are perhaps not the most interestingly written. A good writer could make them more engaging. Perhaps you could include quotes, excerpts, or more examples (saliency), or explore how the findings might actually be used in a specific news task (concreteness).

A final limitation here is that there is often visual information in research papers that isn’t currently being considered in the process. Modern storytelling is about more than text, and at least for now there’s a need for a person to help illustrate and think about whether there are data or figures needed to convey the findings.

Addendum: After publishing the articles, a reader pointed out that one of them contained a few sentences that were similar enough to the underlying research paper that you would want to include quotation marks in order to avoid any claims of plagiarism. That article has now been corrected and an editor’s note included. The other article was also checked, but no issues were found. This check was done manually, aided by an automated script that listed any sentences above a threshold of similarity to the original document.

—

[1] What research question is the paper trying to answer? Explain what the researchers did to study that question, including the specific methods used and analyses performed. Explain thoroughly and be sure to include specific details.

[2] What are the key findings reported in the paper that are important for journalists such as reporters and editors? How might journalists such as reporters and editors benefit from these findings? Why might there still be limits to those benefits? Explain thoroughly and be sure to include specific details.
[3] Critique the findings of the paper, focusing on their validity and utility for journalists such as reporters and editors. Are there reasons not to trust any of the findings? Explain thoroughly and be sure to include specific details.

[4] Here are some important observations about that research paper: <extracted information>. Write a 600 word article about the paper using only the paper text and the important observations about the paper above and focusing on the benefits and limitations of the findings for journalists. Reduce scientific jargon and technical terminology in the writing.

[5] Here are some important observations about that research paper: <extracted information>. Write a 600 word article about the paper in the style of an online blogger, using only the paper text and the important observations about the paper above and focusing on the benefits and limitations of the findings for journalists. Reduce scientific jargon and technical terminology in the writing so that it is accessible to a broad audience.
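To make the two-step process concrete, here is a minimal sketch of how extraction prompts like [1]–[3] could feed a summary prompt like [4], using the openai Python package’s pre-1.0 ChatCompletion interface. The system prompt, temperature, and abridged prompt strings below are placeholder assumptions, not the exact configuration in the Colab Notebook.

```python
# Minimal sketch of the two-step chain (extract, then summarize), assuming the
# pre-1.0 openai Python package. The system prompt and temperature are
# placeholders, not the settings used in the post's Colab Notebook.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

MODEL = "gpt-4"
SYSTEM_PROMPT = "You analyze research papers for journalists."  # assumed

EXTRACTION_PROMPTS = [
    "What research question is the paper trying to answer? ...",  # prompt [1], abridged
    "What are the key findings reported in the paper ...",        # prompt [2], abridged
    "Critique the findings of the paper ...",                     # prompt [3], abridged
]

def ask(prompt: str) -> str:
    """Send one chat-completion request and return the model's reply text."""
    response = openai.ChatCompletion.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": prompt},
        ],
        temperature=0.2,  # assumed; low values suit extraction-style tasks
    )
    return response["choices"][0]["message"]["content"]

paper_text = open("paper.txt").read()  # full text of a (short) research paper

# Step 1: run each extraction prompt over the paper text.
observations = [ask(f"{p}\n\nPaper text:\n{paper_text}") for p in EXTRACTION_PROMPTS]

# Step 2: feed the extracted observations, plus the paper text, into prompt [4].
summary_prompt = (
    "Here are some important observations about that research paper: "
    + " ".join(observations)
    + "\n\nWrite a 600 word article about the paper using only the paper text and "
    "the important observations about the paper above and focusing on the benefits "
    "and limitations of the findings for journalists. Reduce scientific jargon and "
    "technical terminology in the writing.\n\nPaper text:\n" + paper_text
)
print(ask(summary_prompt))
```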
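The addendum mentions an automated script that flags sentences above a threshold of similarity to the original document. Here is a minimal sketch of such a check using only the Python standard library; the 0.8 threshold and the naive sentence splitter are illustrative assumptions rather than the settings used for the correction.

```python
# Sketch of a sentence-similarity check like the one described in the addendum.
# The 0.8 threshold and the rough sentence splitter are illustrative assumptions.
import re
from difflib import SequenceMatcher

def split_sentences(text: str) -> list[str]:
    """Very rough sentence splitter on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def flag_similar_sentences(article: str, paper: str, threshold: float = 0.8):
    """Return (article_sentence, paper_sentence, score) triples above the threshold."""
    paper_sentences = split_sentences(paper)
    flagged = []
    for sentence in split_sentences(article):
        for source in paper_sentences:
            score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
            if score >= threshold:
                flagged.append((sentence, source, round(score, 2)))
    return flagged

if __name__ == "__main__":
    article_text = open("generated_article.txt").read()  # the AI-drafted article
    paper_text = open("paper.txt").read()                # the underlying research paper
    for article_sentence, paper_sentence, score in flag_similar_sentences(article_text, paper_text):
        print(f"{score}: {article_sentence!r} ~ {paper_sentence!r}")
```

Any sentence this flags would be a candidate for quotation marks or rewriting, which is exactly the kind of issue the reader caught in one of the published articles.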