11.5.1 Seq2Seq Roadmap: Input Sequence to Output Sequence

Seq2Seq handles tasks where both input and output are sequences: translation, summarization, rewriting, dialogue, and error correction.

[Figure: Seq2Seq and Attention chapter learning order diagram]

[Figure: Seq2Seq encoder-decoder bottleneck map]

[Figure: T5 text-to-text task unification map]

The bridge to modern LLMs is clear: generation happens step by step, and attention helps the decoder look back at useful input positions.

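# Toy example: the decoder emits one target token per step.
source = ["I", "love", "NLP"]
target = ["J'aime", "le", "NLP"]

for step, token in enumerate(target, start=1):
    print(f"decode_step_{step}:", token)

# The lengths match here by coincidence; Seq2Seq does not require it.
print("source_length:", len(source))
print("target_length:", len(target))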

Expected output:

decode_step_1: J'aime
decode_step_2: le
decode_step_3: NLP
source_length: 3
target_length: 3
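
The "look back at useful input positions" idea can be sketched with plain dot-product attention. This is a minimal illustration, not a trained model: the two-dimensional `encoder_states` and the decoder `query` are made-up numbers, and `attend` is a hypothetical helper name.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    # Score each input position by its dot product with the decoder query.
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

source = ["I", "love", "NLP"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # assumed toy vectors
query = [0.0, 2.0]  # assumed decoder state at one decoding step

weights, context = attend(query, encoder_states)
most_attended = source[weights.index(max(weights))]
print("most_attended:", most_attended)  # most_attended: love
```

The weights sum to 1, so they can be read as a soft alignment over input positions that changes at every decoding step.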

Generation projects should record decoding strategy, failure cases, and whether important input information was lost.
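
One reason to record the decoding strategy is that greedy and beam search can return different sequences from the same model. The sketch below uses a hypothetical two-step probability table (`TOY_MODEL`) rather than a real network; the numbers are chosen so that the locally best first token leads to a lower-probability sequence overall.

```python
import math

# Assumed next-token distributions, keyed by the prefix decoded so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}

def greedy_decode(model, steps):
    # Keep only the single most probable token at each step.
    prefix, logp = (), 0.0
    for _ in range(steps):
        token, p = max(model[prefix].items(), key=lambda kv: kv[1])
        prefix, logp = prefix + (token,), logp + math.log(p)
    return prefix, logp

def beam_decode(model, steps, beam_width):
    # Keep the beam_width best partial sequences by total log-probability.
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = [
            (prefix + (token,), logp + math.log(p))
            for prefix, logp in beams
            for token, p in model[prefix].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]

greedy_seq, greedy_logp = greedy_decode(TOY_MODEL, steps=2)
beam_seq, beam_logp = beam_decode(TOY_MODEL, steps=2, beam_width=2)
print("greedy:", greedy_seq, round(math.exp(greedy_logp), 2))  # ('a', 'x') 0.33
print("beam:  ", beam_seq, round(math.exp(beam_logp), 2))      # ('b', 'x') 0.36
```

With `beam_width=1` the beam search reduces to greedy decoding, which is a quick sanity check when comparing the two.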

| Step | Read | Practice Output |
|------|------|-----------------|
| 1 | Encoder-Decoder | Explain why input and output can have different lengths |
| 2 | Attention | Explain dynamic alignment during generation |
| 3 | Machine translation | Connect teacher forcing, decoding, BLEU/error analysis |
| 4 | CTC and speech | See what changes when input/output are not frame-aligned |
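
For step 4, the core of CTC decoding can be sketched in a few lines: a model emits one label per input frame, and the output sequence is recovered by merging repeated labels and then dropping blanks. This is a simplified best-path (greedy) collapse only. The `"_"` blank symbol and the frame labels are assumptions for illustration, and the per-frame argmax that would produce them is omitted.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol for this sketch

def ctc_collapse(frames, blank=BLANK):
    # 1) Merge consecutive duplicate labels, 2) remove blanks.
    merged = [label for label, _ in groupby(frames)]
    return [label for label in merged if label != blank]

# Eleven frame-level labels collapse to a five-character output.
frames = ["_", "h", "h", "_", "e", "l", "l", "_", "l", "o", "o"]
decoded = "".join(ctc_collapse(frames))
print("frames:", len(frames), "decoded:", decoded)  # frames: 11 decoded: hello
```

The blank between the two runs of `l` is what lets the output keep a genuine double letter; without it the repeats would merge into one.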

Keep this page’s proof of learning as a small evidence card:

- Source Target: source text, target text, and task type
- Decoded Output: generated summary, translation, transcript, or sequence result
- Alignment Note: attention, CTC path, coverage, or copied source evidence
- Failure Check: omission, repetition, hallucination, wrong alignment, or weak evaluation
- Expected Output: generated text with factual or alignment review notes
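
Part of the Failure Check can be automated with simple heuristics. The sketch below flags repeated n-grams (a rough repetition signal) and source terms missing from the output (a rough omission signal). Both helper names are hypothetical, the example sentence is made up, and neither check replaces reading the output.

```python
from collections import Counter

def repeated_ngrams(tokens, n=2):
    # Count every n-gram and keep those that occur more than once.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(ngrams)
    return {gram: count for gram, count in counts.items() if count > 1}

def missing_source_terms(required_terms, output_tokens):
    # Report required source terms that never appear in the output.
    produced = set(output_tokens)
    return [term for term in required_terms if term not in produced]

output = "the cat sat on the mat the cat sat".split()
repeats = repeated_ngrams(output, n=2)
missing = missing_source_terms(["cat", "mat", "dog"], output)

print("repeated_bigrams:", repeats)
# repeated_bigrams: {('the', 'cat'): 2, ('cat', 'sat'): 2}
print("missing_terms:", missing)
# missing_terms: ['dog']
```

Exact token matching misses paraphrases and inflected forms, so treat a flagged term as a prompt for manual review rather than a confirmed omission.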

You pass this chapter when you can explain encoder-decoder, attention, greedy/beam decoding, and one generation failure.

Check reasoning and explanation
  1. A passing answer starts from the text unit and output type: token, span, sentence label, sequence, embedding, or generated text.
  2. The evidence should include a small dataset example, model or pipeline choice, metric, and at least one inspected error case.
  3. A good self-check distinguishes preprocessing issues from model issues, such as tokenization mistakes, label ambiguity, data imbalance, or hallucinated generation.