What is More Likely to Happen Next


What did the authors try to accomplish?

In this paper, the authors propose a new task, future event prediction, together with a new dataset for it. They use a different paradigm for collecting data on MTurk, which lets them gather easy, medium, and hard examples while avoiding negation cues and the other biases that MTurk annotators usually introduce.

What are the key elements of the approach?

The key element of this paper is the data collection method rather than the future event prediction model. The data collection procedure includes:

Standard Data Collection: In this stage, the authors collected 50% of the data by giving annotators only the video and a premise summary and asking them to write (1) a more likely future event and (2) a less likely future event. This data is treated as the easy data.

Adversarial Data Collection: In this stage, the authors trained a classifier on the data from the previous stage to distinguish more likely events from less likely ones. When an annotator writes a more likely and a less likely event, both are passed to the classifier, and if the difference between their predicted probabilities is too large, the example is discarded. The hypothesis is that when the two probabilities are close, it is hard for a model to tell the more likely event from the less likely one. This data is treated as the hard data.
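The filtering rule described above can be sketched roughly as follows. This is only an illustration of the idea as summarized in these notes, not the authors' code: `keep_pair`, `toy_score`, and the `max_gap` threshold of 0.3 are all hypothetical, and the word-overlap scorer merely stands in for the trained classifier.

```python
from typing import Callable


def toy_score(premise: str, event: str) -> float:
    """Hypothetical stand-in for the trained classifier.

    Returns the fraction of words in `event` that also occur in `premise`.
    """
    premise_words = set(premise.lower().split())
    words = event.lower().split()
    if not words:
        return 0.0
    return sum(w in premise_words for w in words) / len(words)


def keep_pair(
    score: Callable[[str, str], float],
    premise: str,
    more_likely: str,
    less_likely: str,
    max_gap: float = 0.3,  # assumed threshold, not taken from the paper
) -> bool:
    """Keep an annotated pair only if the scorer finds the two events close."""
    gap = abs(score(premise, more_likely) - score(premise, less_likely))
    return gap <= max_gap


premise = "a man picks up a guitar on stage"
more = "the man plays the guitar"

# Scores are close, so the pair is kept as a hard example.
hard_kept = keep_pair(toy_score, premise, more, "the man puts the guitar down")
# Scores are far apart, so the pair is discarded.
easy_kept = keep_pair(toy_score, premise, more, "it starts raining")
```

In practice the scorer would be the model trained on the standard-collection data, and the threshold would have to be tuned.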

Adversarial Matching:
Once the authors had enough data, they generated additional synthetic examples using adversarial matching. For each more likely (positive) event, they pick a negative from the other negatives such that it is similar to the premise but not too similar to the positive. The first condition makes it harder for a model to tell the two apart, and the second makes sure the chosen negative is not actually a valid positive event.
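A minimal sketch of the matching step described above, under loud assumptions: the Jaccard word-overlap similarity, the `match_negative` helper, and the `max_pos_sim` cutoff of 0.5 are hypothetical choices made for illustration. The paper presumably relies on learned similarity scores rather than word overlap.

```python
from typing import Callable, List, Optional


def jaccard(a: str, b: str) -> float:
    """Toy word-overlap similarity; a stand-in for a learned similarity model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def match_negative(
    premise: str,
    positive: str,
    candidates: List[str],
    max_pos_sim: float = 0.5,  # assumed cutoff, not taken from the paper
    sim: Callable[[str, str], float] = jaccard,
) -> Optional[str]:
    """Pick the candidate closest to the premise that is not too close to the positive."""
    allowed = [
        c for c in candidates
        if c != positive and sim(positive, c) < max_pos_sim
    ]
    if not allowed:
        return None
    return max(allowed, key=lambda c: sim(premise, c))


premise = "a chef cracks an egg into a pan"
positive = "the chef fries the egg"
candidates = [
    "the chef fries the egg quickly",  # too similar to the positive, excluded
    "a chef washes a pan",             # close to the premise, allowed
    "the dog runs outside",            # allowed, but unrelated to the premise
]

chosen = match_negative(premise, positive, candidates)
```

Here `chosen` ends up being the candidate that overlaps most with the premise among those that pass the positive-similarity cutoff.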

What can I use for myself?

The data collection idea is interesting and clever, and we can reuse it whenever we need to build a synthetic or human-annotated dataset. Beyond that, my main motivation was to find a dataset for training a future event prediction model, which might help me in my current work. The dataset from this paper addresses that need. The only issue is that when I looked through the released dataset, there was no premise: each example had a video, a positive event, and a negative event, but not the premise summary described in the paper.

I am still trying to find more work of this type.

3+ Most Important Things

  1. Data collection method
  2. VLEP dataset

1+ Deficiencies

  1. I didn't find any deficiencies, but this dataset is probably easy for current LLMs to solve.



