The answer given by pyke is incorrect even though the translation is correct. #2

@Shujie24

This is similar to issue #1.
I ran into it while solving ProntoQA with pyke. A typical example is ProntoQA_10:

{
    "id": "ProntoQA_10",
    "context": "Every impus is earthy. Each impus is a jompus. Jompuses are small. Jompuses are rompuses. Rompuses are not amenable. Rompuses are wumpuses. Wumpuses are wooden. Wumpuses are zumpuses. Every zumpus is temperate. Every zumpus is a dumpus. Dumpuses are dull. Dumpuses are vumpuses. Every vumpus is not shy. Every yumpus is sweet. Vumpuses are numpuses. Numpuses are not sweet. Numpuses are tumpuses. Fae is a wumpus.",
    "question": "Is the following statement true or false? Fae is sweet.",
    "options": [
      "A) True",
      "B) False"
    ],
    "answer": "B",
    "explanation": [
      "Fae is a wumpus.",
      "Wumpuses are zumpuses.",
      "Fae is a zumpus.",
      "Every zumpus is a dumpus.",
      "Fae is a dumpus.",
      "Dumpuses are vumpuses.",
      "Fae is a vumpus.",
      "Vumpuses are numpuses.",
      "Fae is a numpus.",
      "Numpuses are not sweet.",
      "Fae is not sweet."
    ]
  }

The translation from natural language into rules is also correct:

fact1
	foreach
		facts.Impus($x, True)
	assert
		facts.Earthy($x, True)

fact2
	foreach
		facts.Impus($x, True)
	assert
		facts.Jompus($x, True)

fact3
	foreach
		facts.Jompus($x, True)
	assert
		facts.Small($x, True)

fact4
	foreach
		facts.Jompus($x, True)
	assert
		facts.Rompus($x, True)

fact5
	foreach
		facts.Rompus($x, True)
	assert
		facts.Amenable($x, False)

fact6
	foreach
		facts.Rompus($x, True)
	assert
		facts.Wumpus($x, True)

fact7
	foreach
		facts.Wumpus($x, True)
	assert
		facts.Wooden($x, True)

fact8
	foreach
		facts.Wumpus($x, True)
	assert
		facts.Zumpus($x, True)

fact9
	foreach
		facts.Zumpus($x, True)
	assert
		facts.Temperate($x, True)

fact10
	foreach
		facts.Zumpus($x, True)
	assert
		facts.Dumpus($x, True)

fact11
	foreach
		facts.Dumpus($x, True)
	assert
		facts.Dull($x, True)

fact12
	foreach
		facts.Dumpus($x, True)
	assert
		facts.Vumpus($x, True)

fact13
	foreach
		facts.Vumpus($x, True)
	assert
		facts.Shy($x, False)

fact14
	foreach
		facts.Yumpus($x, True)
	assert
		facts.Sweet($x, True)

fact15
	foreach
		facts.Vumpus($x, True)
	assert
		facts.Numpus($x, True)

fact16
	foreach
		facts.Numpus($x, True)
	assert
		facts.Sweet($x, False)

fact17
	foreach
		facts.Numpus($x, True)
	assert
		facts.Tumpus($x, True)

However, when I give these rules to pyke, the predicted answer is A rather than the correct answer B.
Is there a problem with pyke? Why is it not giving the correct answer?
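
To help rule out the rule set itself, here is a minimal forward-chaining sketch in plain Python. It does not use pyke and is not how pyke works internally. It assumes the only starting fact is `Wumpus(Fae, True)` (from "Fae is a wumpus" in the context) and that the query is `Sweet(Fae, True)`. The names `RULES`, `forward_chain`, and `answer` are made up for this illustration.

```python
# Each rule mirrors one of fact1..fact17 above:
# (premise predicate, premise value) -> (conclusion predicate, conclusion value)
RULES = [
    (("Impus", True), ("Earthy", True)),
    (("Impus", True), ("Jompus", True)),
    (("Jompus", True), ("Small", True)),
    (("Jompus", True), ("Rompus", True)),
    (("Rompus", True), ("Amenable", False)),
    (("Rompus", True), ("Wumpus", True)),
    (("Wumpus", True), ("Wooden", True)),
    (("Wumpus", True), ("Zumpus", True)),
    (("Zumpus", True), ("Temperate", True)),
    (("Zumpus", True), ("Dumpus", True)),
    (("Dumpus", True), ("Dull", True)),
    (("Dumpus", True), ("Vumpus", True)),
    (("Vumpus", True), ("Shy", False)),
    (("Yumpus", True), ("Sweet", True)),
    (("Vumpus", True), ("Numpus", True)),
    (("Numpus", True), ("Sweet", False)),
    (("Numpus", True), ("Tumpus", True)),
]


def forward_chain(initial_facts, rules):
    """Apply single-premise rules until no new fact can be derived."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for (p_pred, p_val), (c_pred, c_val) in rules:
            # Iterate over a snapshot so the set can grow inside the loop.
            for pred, subject, val in list(facts):
                if pred == p_pred and val == p_val:
                    new_fact = (c_pred, subject, c_val)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts


def answer(facts, pred, subject, value):
    """Map a query to the ProntoQA options: A) True, B) False."""
    if (pred, subject, value) in facts:
        return "A"
    if (pred, subject, not value) in facts:
        return "B"
    return "Unknown"


# Assumed starting fact, taken from the context: "Fae is a wumpus."
derived = forward_chain({("Wumpus", "Fae", True)}, RULES)
print(answer(derived, "Sweet", "Fae", True))  # prints "B"
```

With these rules the only path to `Sweet` goes through `Numpus`, which yields `Sweet(Fae, False)`; `Yumpus` is never derived, so `Sweet(Fae, True)` is not either. That matches the expected answer B and suggests the rules are fine, so the discrepancy more likely lies in the facts, the query, or how the result is mapped to an option.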
