userdiff: extend Bash pattern to cover more shell function forms

The previous function regex required explicit matching of function
bodies using `{`, `(`, `((`, or `[[`, which caused several issues:

- It failed to capture valid functions where `{` was on the next line
  due to line continuation (`\`).
- It did not recognize functions with a single-command body, such as
  `x () echo hello`.

Replacing the function body matching logic with `.*$` ensures
that everything on the function definition line is captured.
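As a rough illustration of the new header pattern, the sketch below feeds a few definition lines to `grep -E`, which uses the same POSIX extended syntax. The leading `^[ \t]*` anchor is an assumption (the hunk starts after that part of the pattern), and `is_funcname_line` is a hypothetical helper, not anything in Git.

```shell
#!/bin/sh
# Sketch only: approximates the new function-header pattern with grep -E.

tab=$(printf '\t')
ws="[ $tab]"                      # same as [ \t] in the C string
ident='[a-zA-Z_][a-zA-Z0-9_]*'

# "^$ws*" (optional leading indentation) is assumed; it is not in the hunk.
pattern="^$ws*((($ident$ws*\\($ws*\\))|(function$ws+$ident(($ws*\\($ws*\\))|($ws+)))).*\$)"

# Hypothetical helper: prints "yes" if the line matches the pattern.
is_funcname_line () {
	if printf '%s\n' "$1" | grep -Eq -e "$pattern"
	then
		echo yes
	else
		echo no
	fi
}

is_funcname_line 'x () echo hello'   # single-command body
is_funcname_line 'f () \'            # body continues on the next line
is_funcname_line 'function g {'      # Bash "function" keyword form
is_funcname_line 'echo hello'        # an ordinary command, not a header
```

The first three lines should print `yes` and the last one `no`, since the header alone is now enough to match and the rest of the line is simply captured.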

Additionally, the word regex is refined to better recognize shell
syntax, including additional parameter expansion operators and
command-line options.
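To get a feel for the refined word regex, the sketch below splits sample text with `grep -oE`. It uses only the alternatives visible in this patch, so anything Git may append to the pattern elsewhere is not covered, and `split_words` is a hypothetical helper. `grep -o` is a common extension rather than a POSIX option, so treat this as an approximation of how a word diff would tokenize.

```shell
#!/bin/sh
# Sketch only: tokenizes a line with the alternatives shown in the patch.

LC_ALL=C
export LC_ALL

word_regex='[a-zA-Z_][a-zA-Z0-9_]*'                   # identifiers
word_regex=$word_regex'|\$[a-zA-Z0-9_]+|\$\{'         # $VAR and ${
word_regex=$word_regex'|\|\||&&|<<|>>'                # list separators, redirections
word_regex=$word_regex'|==|!=|<=|>=|[-+*/%&|^]='      # operators ending in '='
word_regex=$word_regex'|:=|:-|:\+|:\?|##|%%|\^\^|,,'  # parameter expansion operators
word_regex=$word_regex'|[-a-zA-Z0-9_]+'               # command-line options
word_regex=$word_regex'|\(|\)|\{|\}|\[|\]'            # brackets and grouping

# Hypothetical helper: prints one token per line.
split_words () {
	printf '%s\n' "$1" | grep -oE -e "$word_regex"
}

split_words 'git diff --no-index'   # --no-index stays a single token
split_words '${var:-default}'       # ${  var  :-  default  }
```

Keeping `--no-index` and `:-` as single tokens means a word diff marks the whole option or operator as changed, rather than individual characters.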

Signed-off-by: Moumita Dhar <dhar61595@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Moumita Dhar
2025-05-16 20:15:12 +05:30
committed by Junio C Hamano
parent cb96e1697a
commit ea8a71b40d
8 changed files with 128 additions and 8 deletions


@@ -59,20 +59,30 @@ PATTERNS("bash",
 "("
 "("
 /* POSIX identifier with mandatory parentheses */
-"[a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
+"([a-zA-Z_][a-zA-Z0-9_]*[ \t]*\\([ \t]*\\))"
 "|"
 /* Bashism identifier with optional parentheses */
-"(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+))"
+"(function[ \t]+[a-zA-Z_][a-zA-Z0-9_]*(([ \t]*\\([ \t]*\\))|([ \t]+)))"
 ")"
-/* Optional whitespace */
-"[ \t]*"
-/* Compound command starting with `{`, `(`, `((` or `[[` */
-"(\\{|\\(\\(?|\\[\\[)"
+/* Everything after the function header is captured */
+".*$"
 /* End of captured text */
 ")",
 /* -- */
-/* Characters not in the default $IFS value */
-"[^ \t]+"),
+/* Identifiers: variable and function names */
+"[a-zA-Z_][a-zA-Z0-9_]*"
+/* Shell variables: $VAR, ${VAR} */
+"|\\$[a-zA-Z0-9_]+|\\$\\{"
+/*Command list separators and redirection operators */
+"|\\|\\||&&|<<|>>"
+/* Operators ending in '=' (comparison + compound assignment) */
+"|==|!=|<=|>=|[-+*/%&|^]="
+/* Additional parameter expansion operators */
+"|:=|:-|:\\+|:\\?|##|%%|\\^\\^|,,"
+/* Command-line options (to avoid splitting -option) */
+"|[-a-zA-Z0-9_]+"
+/* Brackets and grouping symbols */
+"|\\(|\\)|\\{|\\}|\\[|\\]"),
 PATTERNS("bibtex",
 "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
 /* -- */