The Format — What Happens in 11 to 14 Minutes
The IELTS Speaking test has three parts, each with a distinct purpose and a different type of interaction. The test is conducted by a trained examiner in person and is recorded. The same format applies to both Academic and General Training candidates.
Part 1 lasts four to five minutes. The examiner asks questions on familiar topics: home, family, work or study, hobbies, daily routine. Answers are expected to be short to medium length. The purpose is to establish fluency at a comfortable level before the more demanding parts.
Part 2 lasts three to four minutes. The examiner hands the candidate a cue card with a topic and three or four prompts. One minute is given for preparation; the candidate then speaks for one to two minutes. The examiner does not interrupt during the long turn, but may ask one or two follow-up questions before moving to Part 3.
Part 3 lasts four to five minutes. The examiner asks more abstract, discursive questions linked to the Part 2 topic. These require the candidate to express and justify opinions, compare perspectives, and speculate. This is the part most closely aligned with the type of thinking required in the Speaking criteria at Band 7 and above.
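The 11 to 14 minute figure follows directly from the three part durations above. A minimal sketch of that arithmetic, where the dictionary and function names are illustrative only:

```python
# Durations in minutes (shortest, longest) as stated in the text for each part.
PART_MINUTES = {
    "Part 1": (4, 5),
    "Part 2": (3, 4),  # includes the one-minute preparation time
    "Part 3": (4, 5),
}


def total_range(parts):
    """Return the (shortest, longest) total length across all parts."""
    shortest = sum(low for low, _ in parts.values())
    longest = sum(high for _, high in parts.values())
    return shortest, longest


total = total_range(PART_MINUTES)
print(f"Total test length: {total[0]} to {total[1]} minutes")
```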
The Four Criteria — Plain Language
Speaking is scored against four criteria: Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, and Pronunciation. Each is worth 25% of the Speaking band score. The official descriptors are published by the IELTS Partners at ielts.org.
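The equal weighting can be illustrated with a short calculation. This is a sketch only: the function name and the example scores are hypothetical, and the step that converts the weighted mean into the reported band is not described in this guide, so it is deliberately left out.

```python
# The four Speaking criteria, each weighted at 25% as stated in the text.
CRITERIA = (
    "Fluency and Coherence",
    "Lexical Resource",
    "Grammatical Range and Accuracy",
    "Pronunciation",
)
WEIGHT = 0.25


def weighted_speaking_mean(scores):
    """Return the equally weighted mean of the four criterion scores.

    Assumption: this is the raw mean only. How it maps to the reported
    band (rounding) is not covered here.
    """
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criterion scores: {missing}")
    return sum(WEIGHT * scores[name] for name in CRITERIA)


# Hypothetical candidate: strong fluency and pronunciation, weaker vocabulary and grammar.
example_scores = {
    "Fluency and Coherence": 7,
    "Lexical Resource": 6,
    "Grammatical Range and Accuracy": 6,
    "Pronunciation": 7,
}
mean = weighted_speaking_mean(example_scores)
print(mean)
```

Because every criterion carries the same weight, a one-band gain in any single criterion moves the mean by the same 0.25.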
Part 1 — What Goes Wrong
Examiner: "Do you enjoy cooking?" Candidate: "Yes, I do." The test assesses speaking ability, and minimal answers like this give the examiner nothing to score on Fluency and Coherence, Lexical Resource, or Grammatical Range and Accuracy. Part 1 expects two to four sentences per answer.
"That's a great question" and "I'm glad you asked me that" are not natural responses in English-speaking contexts. Examiners note them as rehearsed. They do not add to any scoring criterion and can signal that the candidate is switching into a prepared mode.
Instead, give a direct answer, add a brief reason or example, and optionally extend with a contrast or personal detail. Three moves, two to four sentences total. Practise the structure, not individual answers.
Part 2 — What Goes Wrong
One minute for preparation is enough to note one idea per prompt, not to write full sentences. Candidates who try to script the answer in preparation time often run out of time to plan, then read from the notes haltingly — damaging Fluency more than the lack of notes would have.
The long turn is designed for sustained speech. Stopping at 45 seconds provides insufficient evidence for the examiner to assess Fluency and Coherence at the higher bands. If the card's direct prompts are exhausted, extend by describing how the topic connects to a related idea, or by comparing it to something contrasting.
The cue card specifies the topic and what aspects to cover. Candidates who find the topic unfamiliar and drift to a more comfortable subject are not meeting the task, which affects Fluency and Coherence scoring.
Spend 15 seconds reading the card fully. Then write one key word or short phrase next to each of the three or four sub-prompts. Use the remaining time to think of one specific example or memory to draw on. Notes are allowed but should be a scaffold, not a script.
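As a rough illustration of how little time that leaves, the preparation minute can be budgeted as follows. The 60-second total and the 15-second reading time come from the text; the function name and the returned keys are hypothetical, and how the remaining seconds are divided between note-taking and recalling an example is left to the candidate.

```python
PREP_SECONDS = 60  # one minute of preparation, as stated in the text
READ_SECONDS = 15  # suggested time for reading the cue card


def prep_budget(num_prompts):
    """Return a simple time budget for the Part 2 preparation minute."""
    if num_prompts not in (3, 4):
        raise ValueError("cue cards are described as having three or four prompts")
    return {
        "read_card_seconds": READ_SECONDS,
        # Shared between writing the notes and recalling one example.
        "notes_and_example_seconds": PREP_SECONDS - READ_SECONDS,
        # One key word or short phrase per prompt.
        "notes_to_write": num_prompts,
    }


budget = prep_budget(4)
print(budget)
```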
Part 3 — What Goes Wrong
Part 3 asks for abstract thinking: "Do you think technology has changed the way people communicate?" The expected response discusses trends, reasons, comparisons, and implications. A story about the candidate's own use of technology answers a different question.
"Yes, I think so." Part 3 is the examiner's best window into the upper grammar and vocabulary bands. Short answers cut off that window. Examiners often prompt with "Why do you think that?" — but candidates who consistently require prompting score lower on Fluency and Coherence than those who extend naturally.
Examiners are trained to recognise when a candidate is delivering a prepared speech rather than thinking in real time. If a Part 3 topic is the same as a topic the candidate rehearsed, the answer may sound fluent but the register, the linking, and the vocabulary choices often reveal the rehearsal. Examiners can redirect the question to a related angle — a candidate delivering memorised content cannot follow the redirect smoothly.
State a clear position. Give a reason. Add an example or qualifying condition. Acknowledge a counter-position briefly if the question invites it. This is essentially a Part 3 answer in four moves, the same pattern that works for a body paragraph in Writing Task 2.
Pronunciation — What Is and Is Not Penalised
Pronunciation at Band 7 does not require a British or American accent. IELTS assesses whether pronunciation features — word stress, sentence stress, connected speech, intonation — are used in a way that makes the speaker easy to understand and natural to listen to. A consistent Bangladeshi or South Asian accent does not reduce a candidate's Pronunciation band, provided it does not cause misunderstanding.
What is assessed: whether the candidate uses stress on the correct syllables of words, whether sentence rhythm creates natural groupings of meaning, whether consonant and vowel sounds are produced consistently enough that the listener can follow without effort. Occasional lapses are permitted at Band 7.
The specific features that most commonly affect the Pronunciation band for Bengali-speaking candidates include final consonant clusters (the /st/, /nd/, /ld/ clusters that Bengali phonology handles differently), the distinction between /v/ and /b/ in initial position, and sentence-level stress patterns that place emphasis on function words rather than content words.
Intelligibility is the criterion. An accent that the listener can follow without difficulty does not cost marks. The distinction matters for preparation: practising to sound like a BBC presenter is not the goal. Practising to produce consonant clusters, word stress, and sentence rhythm accurately is.
The Memorised-Answer Trap
The most common preparation mistake across all three parts is building a bank of rehearsed answers to common questions. The logic is understandable — more time with prepared material means the material feels fluent. The problem is that the test does not always ask those exact questions, and examiners who hear the same prepared introduction to "describe a person you admire" or "talk about a place you have visited" across hundreds of tests can identify the register shift immediately.
More practically: rehearsed answers do not flex. If the examiner's Part 3 question takes an unexpected angle, a candidate with only rehearsed content cannot respond authentically. This is where Fluency and Coherence breaks down — not because the candidate lacks English, but because the test is no longer moving through the prepared territory.
The alternative is to practise the structure of responses, not specific answers: how to open a Part 1 answer naturally, how to structure a two-minute Part 2 turn, how to extend a Part 3 opinion with a reason and an example. The structure transfers to any question; rehearsed content does not.
For criterion-by-criterion feedback on Speaking practice, including Part 2 timing and Part 3 extension, see the IELTS tutoring service. The Writing Task 2 guide covers the parallel structure behind well-developed arguments.
Sajjadur Rahman
IELTS Tutor · University of Dhaka
Offers IELTS Speaking preparation including mock interviews with criterion-specific feedback, Part 2 timing practice, and Part 3 extension strategies for candidates targeting Band 7 and above.