Regex error on macOS  #4053

@rrsettgast

Describe the bug
XML parsing fails on valid nested numeric array input because GEOS pre-validates XML attribute values with generated std::regex patterns. For valid 2D real-array input such as sourceCoordinates="{ { 1575, 2400, 2900 } }", Apple libc++ throws:

std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.

This happens before the actual typed array parser gets to parse the value. The input appears valid; the failure is in regex-based pre-validation.
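As a stopgap, the pre-validation call could at least distinguish "the value does not match" from "the regex engine gave up". The following is a minimal sketch, not GEOS code: `ValidationResult` and `guardedMatch` are hypothetical names, and the only library facilities used are `std::regex`, `std::regex_match`, `std::regex_error`, and the standard `error_complexity` / `error_stack` codes.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch; not part of GEOS.
struct ValidationResult
{
  bool matched = false;        // true only if the regex matched the whole value
  bool engineFailure = false;  // true if std::regex threw (bad pattern or engine limit)
  std::string message;         // context for the user; empty on success
};

// Hypothetical helper: run std::regex_match, but convert a std::regex_error
// into a result that still names the attribute and the offending value.
ValidationResult guardedMatch( std::string const & attribute,
                               std::string const & value,
                               std::string const & pattern )
{
  ValidationResult result;
  try
  {
    std::regex const re( pattern );                  // throws on a malformed pattern
    result.matched = std::regex_match( value, re );  // may throw error_complexity or error_stack
    if( !result.matched )
    {
      result.message = "Attribute '" + attribute + "' with value '" + value +
                       "' does not match the expected format.";
    }
  }
  catch( std::regex_error const & e )
  {
    result.engineFailure = true;
    // These two codes signal an implementation limit, not necessarily bad input.
    bool const limitHit = e.code() == std::regex_constants::error_complexity ||
                          e.code() == std::regex_constants::error_stack;
    result.message = "Regex engine failure while validating attribute '" + attribute +
                     "' with value '" + value + "': " + e.what() +
                     ( limitHit ? " (engine limit reached; the input may still be valid)" : "" );
  }
  return result;
}
```

With a result like this, a caller could skip the regex verdict when `engineFailure` is set and let the typed parser decide, rather than rejecting the input outright.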

There is also a secondary diagnostic issue: the user-facing GEOS error message is blank:

***** Unknown
***** Rank
***** Message :

To Reproduce
Steps to reproduce the behavior:

  1. Build GEOS on macOS/aarch64 with Apple clang/libc++.
  2. Run the wave propagation smoke problem:
bin/geosx -i ../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
  3. Observe the blank GEOS error after the first solver is added:
Opened XML file: .../inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml
Solvers: adding AcousticSEM acousticSolver
***** Unknown
***** Rank
***** Message :

Using LLDB with a C++ exception breakpoint shows the real first exception:

std::regex_error:
The complexity of an attempted match against a regular expression exceeded a pre-set level.

Relevant stack trace:

xmlWrapper::validateString
xmlWrapper::stringToInputVariable<Array<double,2>>
xmlWrapper::readAttributeAsType<real64_array2d>
Wrapper<real64_array2d>::processInputFile
Group::processInputFile
ProblemManager::parseXMLDocument

The failing XML attribute is:

sourceCoordinates="{ { 1575, 2400, 2900 } }"
in:

inputFiles/wavePropagation/acouselas3D_Q2_abc_smoke.xml

Expected behavior
Valid 2D real-array XML input should parse successfully. GEOS should not reject valid input because the regex engine exceeds its implementation-specific complexity limit.

If the input is actually malformed, GEOS should report a useful XML parsing error that includes the node, attribute, value, and expected format.

Platform (please complete the following information):
Machine: local macOS/aarch64 machine
Compiler: Apple clang 17.0.0
MPI: Open MPI 5.0.9
GEOS Version: develop

Additional context
The immediate failure occurs in:

src/coreComponents/dataRepository/xmlWrapper.cpp
inside:

xmlWrapper::validateString(...)
at the std::regex_match(...) call.

The array regex is generated by:

src/coreComponents/codingUtilities/RTTypes.cpp
in constructArrayRegex(...).

Suggested fix: avoid using std::regex as the authoritative runtime validator for scalar and especially array input. The typed parser should be authoritative. For arrays, use LvArray::input::stringToArray or a small purpose-built linear parser to validate braces, commas, dimensions, and scalar tokens. Keep the rtTypes format descriptions for diagnostics/schema/docs, but avoid matching nested numeric arrays with large generated regexes at runtime.
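To illustrate the "small purpose-built linear parser" option, here is a rough sketch of a single-pass validator for brace-delimited real arrays. It is not GEOS or LvArray code: `skipSpaces`, `parseReal`, `parseLevel`, and `isValidRealArray` are hypothetical names. It assumes scalars are anything `std::strtod` accepts (which is locale-dependent and also admits `inf`, `nan`, and hex floats, so a real implementation would likely want a stricter token check), and it checks nesting depth only, not that all rows have the same length.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Advance pos past any whitespace.
void skipSpaces( std::string const & s, std::size_t & pos )
{
  while( pos < s.size() && std::isspace( static_cast< unsigned char >( s[pos] ) ) )
  {
    ++pos;
  }
}

// Consume one real scalar using std::strtod; fails if no characters were consumed.
// Assumption: strtod's grammar is acceptable for scalar tokens.
bool parseReal( std::string const & s, std::size_t & pos )
{
  char const * const begin = s.c_str() + pos;
  char * end = nullptr;
  std::strtod( begin, &end );
  if( end == begin )
  {
    return false;
  }
  pos += static_cast< std::size_t >( end - begin );
  return true;
}

// Consume "{ item, item, ... }". Items are nested arrays until the innermost
// level (depth + 1 == dims), where they are scalars. An empty "{}" is accepted.
bool parseLevel( std::string const & s, std::size_t & pos, int depth, int dims )
{
  skipSpaces( s, pos );
  if( pos >= s.size() || s[pos] != '{' )
  {
    return false;
  }
  ++pos;
  skipSpaces( s, pos );
  if( pos < s.size() && s[pos] == '}' )
  {
    ++pos;
    return true;
  }
  while( true )
  {
    bool const ok = ( depth + 1 < dims ) ? parseLevel( s, pos, depth + 1, dims )
                                         : parseReal( s, pos );
    if( !ok )
    {
      return false;
    }
    skipSpaces( s, pos );
    if( pos >= s.size() )
    {
      return false;  // ran out of input before the closing brace
    }
    if( s[pos] == '}' )
    {
      ++pos;
      return true;
    }
    if( s[pos] != ',' )
    {
      return false;  // items must be separated by commas
    }
    ++pos;
  }
}

// Entry point: true if text is a well-formed array with exactly dims levels of braces.
bool isValidRealArray( std::string const & text, int dims )
{
  if( dims < 1 )
  {
    return false;
  }
  std::size_t pos = 0;
  if( !parseLevel( text, pos, 0, dims ) )
  {
    return false;
  }
  skipSpaces( text, pos );
  return pos == text.size();  // reject trailing garbage
}
```

Each character is visited a bounded number of times and recursion depth equals the array dimension, so there is no backtracking and no engine-specific complexity limit to hit. For the failing attribute above, `isValidRealArray( "{ { 1575, 2400, 2900 } }", 2 )` returns `true`.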

There is also a secondary error-reporting bug: xmlWrapper::processInputException() throws a direct InputError, but the top-level geos::Exception catch flushes the global diagnostic object instead of reporting e.what(). In this case that produces the blank ***** Unknown message, obscuring the real regex failure.
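One possible shape for the reporting fix is a fallback to `e.what()` whenever the diagnostic object has nothing to say. The sketch below is illustrative only: `Diagnostic`, `formatTopLevelError`, and `runAndReport` are hypothetical stand-ins, not the actual GEOS types or handler, and it assumes the thrown exception derives from `std::exception`.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a global diagnostic record; not GEOS's actual type.
struct Diagnostic
{
  std::string message;  // left empty when nothing populated the diagnostic
};

// Build the "Message :" line, preferring the diagnostic but falling back to
// e.what() so that a directly thrown exception never produces a blank report.
std::string formatTopLevelError( Diagnostic const & diag, std::exception const & e )
{
  if( !diag.message.empty() )
  {
    return "***** Message : " + diag.message;
  }
  char const * const what = e.what();
  bool const hasWhat = what != nullptr && *what != '\0';
  return std::string( "***** Message : " ) + ( hasWhat ? what : "(no message available)" );
}

// Run a callable; return an empty string on success or the formatted error text.
template< typename F >
std::string runAndReport( F && body, Diagnostic const & diag )
{
  try
  {
    body();
    return {};
  }
  catch( std::exception const & e )
  {
    return formatTopLevelError( diag, e );
  }
}
```

With an empty diagnostic, a `std::runtime_error( "regex complexity exceeded" )` thrown inside `body` would be reported as `***** Message : regex complexity exceeded` instead of a blank line.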

NOTE: This error was diagnosed using codex.
