Spell Checking
Consulo provides built-in spell checking support that custom language plugins can integrate with.
By providing a SpellcheckingStrategy (consulo.language.spellcheker.SpellcheckingStrategy), a plugin controls which PSI elements in its language are subject to spell checking and how their text is tokenized for the spell checker.
SpellcheckingStrategy
SpellcheckingStrategy is an abstract class that implements LanguageExtension and is annotated with @ExtensionAPI(ComponentScope.APPLICATION).
It maps each PSI element to a Tokenizer instance that breaks the element's text into words for spell checking.
Key Methods
- getTokenizer(PsiElement element): Returns the Tokenizer to use for the given PSI element. The default implementation provides the following behavior:
  - PsiWhiteSpace elements return EMPTY_TOKENIZER (no spell checking).
  - PsiLanguageInjectionHost elements with injected PSI files return EMPTY_TOKENIZER.
  - PsiNameIdentifierOwner elements return myNameIdentifierOwnerTokenizer (a PsiIdentifierOwnerTokenizer instance).
  - PsiComment elements return myCommentTokenizer (a CommentTokenizer instance), unless the comment is a suppression comment or a shebang line at offset 0.
  - PsiPlainText elements return TEXT_TOKENIZER.
  - All other elements return EMPTY_TOKENIZER.
- isMyContext(PsiElement element): Returns true if this strategy applies to the given element. The default implementation returns true for all elements. Override this method to limit spell checking to certain contexts within your language.
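The default dispatch described above can be pictured as a chain of checks. The following is a toy model only: DispatchSketch, Trait, and the string results are hypothetical stand-ins, not Consulo classes. It assumes the checks run in the order listed, and that a suppression comment or shebang line falls back to EMPTY_TOKENIZER.

```java
import java.util.EnumSet;
import java.util.Set;

final class DispatchSketch {
    /** Hypothetical traits of an element; the real method inspects PSI types instead. */
    enum Trait {
        WHITESPACE, HOST_WITH_INJECTION, NAME_IDENTIFIER_OWNER,
        COMMENT, SUPPRESSION_OR_SHEBANG, PLAIN_TEXT
    }

    /** Returns the name of the tokenizer the default rules would select. */
    static String chooseTokenizer(Set<Trait> traits) {
        if (traits.contains(Trait.WHITESPACE)) return "EMPTY_TOKENIZER";
        if (traits.contains(Trait.HOST_WITH_INJECTION)) return "EMPTY_TOKENIZER";
        if (traits.contains(Trait.NAME_IDENTIFIER_OWNER)) return "myNameIdentifierOwnerTokenizer";
        if (traits.contains(Trait.COMMENT)) {
            // Assumption: suppression comments and shebang lines are skipped entirely.
            return traits.contains(Trait.SUPPRESSION_OR_SHEBANG)
                    ? "EMPTY_TOKENIZER"
                    : "myCommentTokenizer";
        }
        if (traits.contains(Trait.PLAIN_TEXT)) return "TEXT_TOKENIZER";
        return "EMPTY_TOKENIZER"; // everything else is not spell checked
    }

    public static void main(String[] args) {
        System.out.println(chooseTokenizer(EnumSet.of(Trait.COMMENT)));
        System.out.println(chooseTokenizer(EnumSet.noneOf(Trait.class)));
    }
}
```

Because the final branch returns EMPTY_TOKENIZER, element types specific to a language (string literals, for example) are only spell checked if a strategy overrides getTokenizer() for them.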
Built-in Tokenizers
SpellcheckingStrategy provides several ready-to-use tokenizer instances:
| Field | Type | Description |
|---|---|---|
| EMPTY_TOKENIZER | Tokenizer | A no-op tokenizer that produces no tokens. Use this to skip spell checking for an element. |
| TEXT_TOKENIZER | Tokenizer<PsiElement> | A TokenizerBase using PlainTextTokenSplitter. Suitable for plain text content. |
| myCommentTokenizer | Tokenizer<PsiComment> | A CommentTokenizer that feeds comment text through CommentTokenSplitter. |
| myNameIdentifierOwnerTokenizer | Tokenizer<PsiNameIdentifierOwner> | A PsiIdentifierOwnerTokenizer that extracts the name identifier and feeds it through IdentifierTokenSplitter. |
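To give a feel for what splitting an identifier into words involves, here is a minimal, self-contained sketch. It is not the IdentifierTokenSplitter implementation, whose exact rules may differ. IdentifierSplitSketch is a hypothetical class that assumes words are separated by non-letter characters and by lower-to-upper case transitions.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentifierSplitSketch {
    /** Splits an identifier on non-letters and on lower-to-upper case changes. */
    static List<String> split(String identifier) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < identifier.length(); i++) {
            char c = identifier.charAt(i);
            if (!Character.isLetter(c)) {
                // Underscores, digits, '$' and similar characters end the current word.
                flush(current, words);
                continue;
            }
            // A lower-to-upper transition starts a new word: "myName" -> "my", "Name".
            if (current.length() > 0
                    && Character.isUpperCase(c)
                    && Character.isLowerCase(current.charAt(current.length() - 1))) {
                flush(current, words);
            }
            current.append(c);
        }
        flush(current, words);
        return words;
    }

    /** Moves the pending word, if any, into the result list. */
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("myNameIdentifierOwner")); // [my, Name, Identifier, Owner]
        System.out.println(split("MAX_RETRY_count2"));      // [MAX, RETRY, count]
    }
}
```

Each resulting word is what a spell checker would look up in its dictionary, which is why a misspelling inside a camelCase name can be reported on just the affected part.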
The Tokenizer Class
The Tokenizer<T> (consulo.language.spellcheker.tokenizer.Tokenizer) abstract class defines how a PSI element's text is broken into tokens for spell checking:
- tokenize(T element, TokenConsumer consumer): Breaks the element text into tokens and passes them to the TokenConsumer. Annotated with @RequiredReadAction.
- getHighlightingRange(PsiElement element, int offset, TextRange textRange): Returns the text range to highlight when a misspelling is found.
The TokenConsumer (consulo.language.spellcheker.tokenizer.TokenConsumer) abstract class receives tokens from the tokenizer:
- consumeToken(PsiElement element, TokenSplitter tokenSplitter): Consumes a token using the given splitter.
- consumeToken(PsiElement element, boolean useRename, TokenSplitter tokenSplitter): Consumes a token with an option to use rename-based correction.
- consumeToken(PsiElement element, String text, boolean useRename, int offset, TextRange rangeToCheck, TokenSplitter tokenSplitter): The most detailed variant, specifying exact text, offset, and range.
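The relationship between text, offset, and rangeToCheck in the most detailed variant is easiest to see with numbers. The sketch below is a stand-alone model, not Consulo code. It assumes that rangeToCheck is expressed in coordinates of text, and that offset is the position of text inside the element, so a word found at [s, e) in text is highlighted at [offset + s, offset + e) in the element. TokenRangeSketch and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenRangeSketch {
    /**
     * Finds runs of letters in text within [rangeStart, rangeEnd) and returns
     * their [start, end) ranges shifted by offset into element coordinates.
     */
    static List<int[]> wordRangesInElement(String text, int offset, int rangeStart, int rangeEnd) {
        List<int[]> ranges = new ArrayList<>();
        int wordStart = -1;
        for (int i = rangeStart; i <= rangeEnd; i++) {
            // The position at rangeEnd acts as a terminator for a trailing word.
            boolean letter = i < rangeEnd && Character.isLetter(text.charAt(i));
            if (letter && wordStart < 0) {
                wordStart = i;
            } else if (!letter && wordStart >= 0) {
                ranges.add(new int[] {offset + wordStart, offset + i});
                wordStart = -1;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        String elementText = "\"helo wrld\"";                            // a quoted string literal
        String text = elementText.substring(1, elementText.length() - 1); // quotes stripped
        for (int[] r : wordRangesInElement(text, 1, 0, text.length())) {
            System.out.println(r[0] + ".." + r[1] + " " + elementText.substring(r[0], r[1]));
        }
    }
}
```

With an offset of 1 for the stripped opening quote, the reported ranges line up with the words as they appear in the original element text.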
Retrieving Strategies
You can retrieve all registered SpellcheckingStrategy instances for a given language using the static method:
List<SpellcheckingStrategy> strategies = SpellcheckingStrategy.forLanguage(myLanguage);
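When several strategies are registered for one language, a caller needs some rule for choosing among them. The following toy model shows one plausible rule: take the first strategy whose isMyContext-style check accepts the element. This is an assumption for illustration, not a statement of how Consulo selects a strategy. StrategySelectionSketch and Strategy are hypothetical stand-ins, and elements are modeled as plain strings.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class StrategySelectionSketch {
    /** Hypothetical stand-in for a strategy: a name plus an isMyContext-style check. */
    static final class Strategy {
        final String name;
        final Predicate<String> isMyContext;

        Strategy(String name, Predicate<String> isMyContext) {
            this.name = name;
            this.isMyContext = isMyContext;
        }
    }

    /** Returns the name of the first strategy that accepts the element, if any. */
    static Optional<String> pick(List<Strategy> strategies, String element) {
        for (Strategy strategy : strategies) {
            if (strategy.isMyContext.test(element)) {
                return Optional.of(strategy.name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Strategy> strategies = List.of(
                new Strategy("templates", e -> e.startsWith("{{")), // narrow context
                new Strategy("default", e -> true));                // accepts everything
        System.out.println(pick(strategies, "{{ name }}")); // Optional[templates]
        System.out.println(pick(strategies, "plain text")); // Optional[default]
    }
}
```

Under this rule, a strategy that keeps the default isMyContext() (always true) would shadow any strategy listed after it, which is one reason to narrow the context when more than one strategy targets the same language.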
Registration
To register a spell checking strategy for your custom language, create a class that extends SpellcheckingStrategy and annotate it with @ExtensionImpl.
Implement the getLanguage() method from LanguageExtension to indicate which language this strategy applies to.
import consulo.annotation.component.ExtensionImpl;
import consulo.language.Language;
import consulo.language.psi.PsiElement;
import consulo.language.spellcheker.SpellcheckingStrategy;
import consulo.language.spellcheker.tokenizer.Tokenizer;
import jakarta.annotation.Nonnull;

@ExtensionImpl
public class MyLanguageSpellcheckingStrategy extends SpellcheckingStrategy {
    @Nonnull
    @Override
    public Language getLanguage() {
        return MyLanguage.INSTANCE;
    }

    @Nonnull
    @Override
    public Tokenizer getTokenizer(PsiElement element) {
        // Spell check string literals as plain text.
        if (element instanceof MyStringLiteralElement) {
            return TEXT_TOKENIZER;
        }
        // Everything else uses the default handling.
        return super.getTokenizer(element);
    }

    @Override
    public boolean isMyContext(@Nonnull PsiElement element) {
        // Same as the default; narrow this to restrict where the strategy applies.
        return true;
    }
}
In this example, the strategy adds spell checking for string literal elements (using TEXT_TOKENIZER) while falling back to the default behavior for all other element types.
Comments and named identifiers are already handled by the default getTokenizer() implementation.
Custom Tokenizers
If the built-in tokenizers do not meet your needs, you can create a custom Tokenizer implementation:
import consulo.annotation.access.RequiredReadAction;
import consulo.document.util.TextRange;
import consulo.language.psi.PsiElement;
import consulo.language.spellcheker.tokenizer.TokenConsumer;
import consulo.language.spellcheker.tokenizer.Tokenizer;
import consulo.language.spellcheker.tokenizer.splitter.PlainTextTokenSplitter;
import jakarta.annotation.Nonnull;

public class MyCustomTokenizer extends Tokenizer<PsiElement> {
    @Override
    @RequiredReadAction
    public void tokenize(@Nonnull PsiElement element, TokenConsumer consumer) {
        String text = element.getText();
        // Strip surrounding quotes before spell checking
        if (text.length() > 2) {
            // Offset 1 skips the opening quote; the range spans the whole stripped text.
            consumer.consumeToken(element, text.substring(1, text.length() - 1),
                    false, 1,
                    new TextRange(0, text.length() - 2),
                    PlainTextTokenSplitter.getInstance());
        }
    }
}
Then return your custom tokenizer from getTokenizer() for the appropriate element types.