I'm trying to convert a Word file containing questions into a dictionary.
The questions look like this:
- Question
a. first answer
b. second answer
c. third answer
d. fourth answer
e. fifth answer
In the file, the correct answer is the one in bold, in this case the third. The Word file is built with MS Word bullet points (1. and so on for questions, and a. and so on for answers).
The resulting dictionary should be:
{
    1: {
        "question": "the question text",
        "answer": ["first answer", "second answer", "third answer", "fourth answer", "fifth answer"],
        "correct_answer": 2
    },
    Other questions...
}
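To make the target shape concrete, here is a minimal sketch of the grouping step on its own. It assumes each paragraph has already been reduced to a `(level, text, is_bold)` tuple, with level 0 for questions and level 1 for answers. That tuple layout and the name `build_quiz` are hypothetical and are not something python-docx provides.

```python
def build_quiz(rows):
    """Group (level, text, is_bold) rows into the nested dictionary.

    Assumption: level 0 marks a question and level 1 marks an answer.
    """
    quiz = {}
    current = None
    for level, text, is_bold in rows:
        if level == 0:
            # Start a new question, numbered in order of appearance
            current = {"question": text, "answer": [], "correct_answer": None}
            quiz[len(quiz) + 1] = current
        elif level == 1 and current is not None:
            if is_bold:
                # Zero-based index of the answer about to be appended
                current["correct_answer"] = len(current["answer"])
            current["answer"].append(text)
    return quiz


# Illustrative rows matching the example question above
rows = [
    (0, "the question text", False),
    (1, "first answer", False),
    (1, "second answer", False),
    (1, "third answer", True),
    (1, "fourth answer", False),
    (1, "fifth answer", False),
]
quiz = build_quiz(rows)
print(quiz[1]["correct_answer"])  # 2
```

Keeping the grouping separate from the document reading makes it possible to check the dictionary logic without a .docx file at hand.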
I tried this code:
from docx import Document


def is_bold(run):
    return run.bold


# Open the document
doc = Document("sample.docx")

# Create an empty dictionary for questions and answers
questions_and_answers = {}

# Iterate only through paragraphs
for paragraph in doc.paragraphs:
    text = paragraph.text.strip()

    # Check if the paragraph starts with a number and a dot
    if text and text[0].isdigit() and text[1] == ".":
        question_number, question = text.split(" ", 1)
        answer_choices = []
        correct_answer_index = None

        # Continue to the next paragraph that will contain the answers
        next_paragraph = paragraph
        while True:
            next_paragraph = next_paragraph.next_paragraph

            # If there are no more paragraphs or it starts with a number, we've reached the end of the answers
            if not next_paragraph or (next_paragraph.text.strip() and next_paragraph.text.strip()[0].isdigit()):
                break

            next_text = next_paragraph.text.strip()

            # If it starts with a letter and a period, consider it as an answer
            if next_text and next_text[0].isalpha() and next_text[1] == ".":
                answer_run = next_paragraph.runs[0]  # Consider only the first "run" to check the style
                answer_text = next_text[3:]  # Remove the answer format (a., b., c., ...)
                answer_choices.append(answer_text)

                # Check if the answer is bold (hence, correct)
                if is_bold(answer_run):
                    correct_answer_index = len(answer_choices) - 1  # Save the index of the correct answer

        # Add the question and answers to the dictionary
        questions_and_answers[question_number] = {
            "question": question,
            "answers": answer_choices,
            "correct_answer_index": correct_answer_index
        }

# Print the resulting dictionary
for number, data in questions_and_answers.items():
    print(f"{number}: {data['question']}")
    print("Answers:")
    for answer in data["answers"]:
        print(f"- {answer}")
    print(f"Index of the correct answer: {data['correct_answer_index']}")
    print()
Unfortunately, I'm getting an empty dictionary.
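One possible explanation, offered as an assumption rather than a confirmed diagnosis: numbers and letters produced by Word's automatic list formatting are generally stored as list properties on the paragraph, not as characters in its text, so a check such as `text[0].isdigit()` may never match. The sketch below uses only the standard library on a hand-written WordprocessingML fragment (not the real sample.docx); `FRAGMENT` and `describe` are hypothetical names used for illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written fragment imitating two auto-numbered paragraphs (made-up data)
FRAGMENT = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>
    <w:r><w:t>Question</w:t></w:r>
  </w:p>
  <w:p>
    <w:pPr><w:numPr><w:ilvl w:val="1"/><w:numId w:val="1"/></w:numPr></w:pPr>
    <w:r><w:rPr><w:b/></w:rPr><w:t>third answer</w:t></w:r>
  </w:p>
</w:body>
""".strip()


def describe(paragraph):
    """Return (list_level, text, is_bold) for one <w:p> element."""
    ilvl = paragraph.find("w:pPr/w:numPr/w:ilvl", NS)
    level = int(ilvl.get(f"{{{W}}}val")) if ilvl is not None else None
    text = "".join(t.text or "" for t in paragraph.iterfind(".//w:t", NS))
    # Simplification: any <w:b> in a run counts as bold; w:val="0" is ignored
    bold = paragraph.find(".//w:r/w:rPr/w:b", NS) is not None
    return level, text, bold


rows = [describe(p) for p in ET.fromstring(FRAGMENT).iterfind("w:p", NS)]
for row in rows:
    print(row)
```

In this fragment the paragraph text is just `Question`, with no leading `1.`, while the list level sits in `w:ilvl`. If the real document is built the same way, the list level would be a more reliable signal than the first characters of the text for telling questions from answers.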