Entries for tag "languages", ordered from most recent. Entry count: 2.
# Parsing Numeric Constants
Fri, 16 Dec 2011
As a personal project, I have started writing a scripting language. The first thing I want to implement is parsing of integer and floating-point numeric constants. The syntax I chose to support is based on C++, with some modifications.
An integer constant in C++ can be written as:
| Example | Base | Form |
|---|---|---|
| 123 | Decimal | Starting with non-zero digit |
| 0x7B | Hexadecimal | Starting with "0x" |
| 0173 | Octal | Starting with "0" |
It can also be suffixed with "u" for an unsigned type, and with "l" for "long" or "ll" for "long long".
The "l" suffix makes little sense in Visual C++, because the "long" type there has the same size as a plain "int": 32 bits, even in 64-bit code. So I'd prefer to use "long" as the type and "ll" as the suffix for 64-bit numbers.
I also don't like the octal form. First, I can't see any use for it. In all of computer science I've seen only one place where the octal system is used: file permissions in Unix. I have never seen a single use of the octal form in C++ code. Second, I think that preceding a number with zeros shouldn't change its meaning, so the choice of "0" as the prefix for octal (instead of, for example, "0o") is very unfortunate in my opinion.
It would be much more useful if we could place binary numbers in code. Java 7 introduces such syntax with the "0b" prefix. It also has another interesting feature that I like: it allows underscores in numeric literals, so you can make long constants more readable by grouping digits, like "0b0011_1010".
I'd like to support decimal, hexadecimal and binary numbers in my language. Regular expressions that match these are:
    [0-9][0-9_]*[Uu]?[Ll]?
    0[Xx][0-9A-Fa-f_]+[Uu]?[Ll]?
    0[Bb][01_]+[Uu]?[Ll]?
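As a quick way to try these three patterns out, here is a minimal sketch in Python. It is only an illustration, not the parser of the language itself: the names `INTEGER_PATTERNS` and `parse_integer_literal` are made up for this example, and capture groups were added around the digits and suffixes so the value can be extracted. One thing the sketch makes visible is that the patterns as written also accept input like "0x_", which has no digits at all, so that case is rejected explicitly here.

```python
import re

# The three patterns from the text, with capture groups added (an assumption
# of this sketch) for the digits, the "u" suffix and the "l" suffix.
INTEGER_PATTERNS = [
    (16, re.compile(r"0[Xx]([0-9A-Fa-f_]+)([Uu]?)([Ll]?)")),
    (2, re.compile(r"0[Bb]([01_]+)([Uu]?)([Ll]?)")),
    (10, re.compile(r"([0-9][0-9_]*)([Uu]?)([Ll]?)")),
]


def parse_integer_literal(text):
    """Return (value, is_unsigned, is_long), or None if text does not match."""
    for base, pattern in INTEGER_PATTERNS:
        match = pattern.fullmatch(text)
        if match is None:
            continue
        # Underscores only group digits visually, so drop them before converting.
        digits = match.group(1).replace("_", "")
        if not digits:
            return None  # e.g. "0x_": matches the pattern but has no digits
        return int(digits, base), match.group(2) != "", match.group(3) != ""
    return None


if __name__ == "__main__":
    for sample in ["123", "0x7B", "0b0011_1010", "0173", "42uL", "0x_", "12.5"]:
        print(sample, "->", parse_integer_literal(sample))
```

Note that "0173" is read here as decimal 173, not as octal, which is the behavior argued for above: leading zeros do not change the meaning of a number.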
Floating-point numbers are more complicated. A constant that uses all possible features might look like this:
111.222e-3f
The question is which parts are required and which are optional. It may seem that the representation of floating-point numbers in code is obvious, but there are actually subtle differences between programming languages. "111" is obviously an integer constant, but is any one of the following enough to form a proper floating-point constant: a dot with no digits on the left, a dot with no digits on the right, an exponent part, or an "f" suffix?
| Constant | C++ | HLSL | C# |
|---|---|---|---|
| 111.222 | OK | OK | OK |
| 111. | OK | OK | Error |
| .222 | OK | OK | OK |
| 111e3 | OK | OK | OK |
| 111f | Error | Error | OK |
I want to support all these options, so regular expressions that match floating-point constants in my language are:
    [0-9]+[Ff]
    [0-9]+([eE][+-]?[0-9]+)[Ff]?
    [0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[Ff]?
    \.[0-9]+([eE][+-]?[0-9]+)?[Ff]?
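The four patterns can be checked against the constants discussed above with a short Python sketch. This is an illustration under assumptions, not the actual tokenizer: the names `FLOAT_PATTERN`, `is_float_literal` and `float_value` are hypothetical, the four alternatives are simply joined with "|", and Python's built-in `float` is used for the conversion after the "f" suffix is stripped.

```python
import re

# The four floating-point patterns from the text, joined into one alternation.
FLOAT_PATTERN = re.compile(
    r"[0-9]+[Ff]"
    r"|[0-9]+([eE][+-]?[0-9]+)[Ff]?"
    r"|[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?[Ff]?"
    r"|\.[0-9]+([eE][+-]?[0-9]+)?[Ff]?"
)


def is_float_literal(text):
    """True if the whole text matches one of the four patterns."""
    return FLOAT_PATTERN.fullmatch(text) is not None


def float_value(text):
    """Convert a matching literal to a Python float; raise ValueError otherwise."""
    if not is_float_literal(text):
        raise ValueError("not a floating-point literal: " + repr(text))
    # Python's float() does not know the "f" suffix, so strip it first.
    return float(text.rstrip("Ff"))


if __name__ == "__main__":
    for sample in ["111.222e-3f", "111.222", "111.", ".222", "111e3", "111f", "111", "."]:
        print(sample, "->", is_float_literal(sample))
```

With these patterns, all five forms from the table are accepted, while a plain "111" and a lone "." are not, so "111" remains available as an integer constant.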
# Technology for Data Processing
Mon, 04 Oct 2010
When doing engineering work on a computer, whether in gamedev or any other field, there is sometimes a need to process or visualize tabular data, especially numbers, e.g. performance statistics or other data gathered during program execution. What technology is best for this purpose? At the moment I know about the following solutions:
- Spreadsheet software, like Microsoft Excel or OpenOffice Calc. These programs can import CSV files and draw a great variety of plots, but for more advanced data processing it would be useful to have something closer to a programming language.
- At the other end of the spectrum, we can write ordinary C++ or C# programs to do the work. This can be hard, though, because we have to code everything ourselves, including data structures, file loading and plot drawing. It would be nicer to use a higher-level scripting language with a rich standard library, built-in data structures and the like.
- Any scripting language can do this. For example, PHP (which I know best) can be used as a general scripting language, not only in connection with a web server. After all, its name stands for "PHP: Hypertext Preprocessor", and it is well suited to processing strings and text files.
- MATLAB and its free alternative, Scilab, are technologies designed specifically for crunching numbers. Each provides its own programming language with convenient built-in data types like matrices, and can also draw plots.
- Python is something in between: a normal scripting language with weird but nice syntax and language-level support for operations like array slicing and complex numbers. It looks like many scientists and engineers use it for computation and data analysis, because there are Python libraries designed for this, like NumPy and SciPy.
- There is also The R Project, mentioned on Nick Darnell's blog. It looks like another environment with its own, different programming language, designed especially for statistical computing. It can also load data from files and draw plots.
- And finally, some tasks can be accomplished in environments that allow playing around with computer graphics, like EvalDraw (with its own simple, C-like programming language) or Processing (with a language based on Java).
So there are many options here. I've played a bit with all of them at some point, but learning any one of them thoroughly would obviously take a lot of time and effort. So maybe you can help me decide: which solution would you recommend? Personally, I lean a little toward Python, because it is a general-purpose language that I can also use in other fields, like coding Blender plugins.
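To give a rough idea of what the Python option looks like for the kind of task described at the top of this post, here is a minimal sketch that uses only the standard library. Everything in it is made up for illustration: the CSV content, the column name `time_ms` and the helper `load_column` are hypothetical, and a real workflow would probably read a file and use libraries such as NumPy for anything bigger.

```python
import csv
import io
import statistics

# Hypothetical performance data, as it might be exported from a program run.
CSV_TEXT = """frame,time_ms
1,16.5
2,17.0
3,33.0
4,16.5
5,17.0
"""


def load_column(csv_text, column):
    """Read one numeric column from CSV text into a list of floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]


times = load_column(CSV_TEXT, "time_ms")

summary = {
    "mean": statistics.mean(times),
    "median": statistics.median(times),
    "max": max(times),
}

# Language-level slicing: the last three samples.
last_three = times[-3:]

# Language-level complex numbers.
product = (1 + 2j) * (3 - 1j)

if __name__ == "__main__":
    print(summary)
    print(last_three)
    print(product)
```

Plotting is not covered by this sketch, since drawing plots needs a third-party library.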