Skip to content

Proposal: Reducing size of library (WIP) #372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
angularsen opened this issue Jan 13, 2018 · 17 comments
Closed

Proposal: Reducing size of library (WIP) #372

angularsen opened this issue Jan 13, 2018 · 17 comments

Comments

@angularsen
Copy link
Owner

angularsen commented Jan 13, 2018

NOTE: This is work in progress and early thoughts, but input is most welcome.

Motivation

UnitsNet nuget is nearing the 1MB mark and this is probably a lot bigger than most consumers would expect, since most users only work with a small set of quantities and units.

Proposal 1 - Core nuget + quantity nugets + meta nugets

This tries to achieve a kind of extension model, where new quantities are added via nugets - potentially 3rd party quantities.

  • Move all shared code into UnitsNet.Core (UnitSystem, UnitConverter, QuantityParser, UnitFormatter, QuantityValue etc..)
  • Repurpose our current UnitsNet nuget to a meta package that brings in the most common units (think 80/20, this is what we want most people to use)
  • Add nugets for all quantities UnitsNet.Quantities.Length, UnitsNet.Quantities.Mass etc. (around 80 of these now), which has the Length struct type (Length.g.cs and Length.extra.cs files) and any related extension methods (NumberToLengthExtensions), with a dependency on UnitsNet.Core
  • Add metapackage nugets to bring in a set of quantities for an engineering/technical domain, such as UnitsNet.Meta.ElectricalEngineering, UnitsNet.Meta.AudioEngineering, UnitsNet.Meta.PetroleumEngineering, UnitsNet.Meta.ChemicalEngineering

Concerns:

  • Discoverability is hurt by moving code out into separate packages, so you need to browse packages to see what is possible unless you add a metapackage that brings in all or the most common quantities
  • A lot of nugets (80 for quantities plus a few meta packages to bring in groups of quantities)
  • Extension model with static types is tricky. How can we provide operator overloads such as Force = Pressure * Area, when the user may not have added all three nuget packages? Should there be a nuget package for the Force+Pressure+Area combination? Can we move operator overloads as well as interfaces and enums for all units to the core nuget, while keeping the quantity implementations and the bulk of the quantity code in separate nugets? Need to think of something clever here.

Proposal 2 - Use extension methods and builder pattern to reduce member count per quantity

Instead of this:

Length l = Length.FromMeters(1);
Length l = Length.FromKilometers(1);
Length l = Length.FromMillimeters(1);

We could do something like this:

Length l = Length.From.Meters(1);
Length l = Length.From.Kilo.Meters(1);
Length l = Length.From.Milli.Meters(1);

The idea is to move all the prefix code (milli, kilo, mega etc) to a base type #371 for reuse across quantities. We can achieve this by the builder pattern described above using inheritance or extension methods.

A similar approach could be made with conversion properties:

double mm = myLength.Milli.Meters; // Milli effectively multiplies the Meters result by 1e3

Concerns:

  • The syntax is awkward and not very readable
  • Code size reduction is only significant for quantities with a lot of prefixes
  • Prefixes is not always as simple as multiplying by 1e3 etc. We have 3 logarithmic quantities AmplitudeRatio, PowerRatio and Level where the arithmetic is entirely different.

Proposal 3 - Move number extension methods to separate nuget

I just realized we have a ton of number extension methods, and this part of the library is something I think many simply don't use and could make sense to make opt-in.

For every unit, we have an extension method like Length l = 1.cm(); and Mass m = 85.kg();. To support the four number types (int, long, double, decimal) we currently care about, we have 4 overloads of this method. And to support nullable like int? nullable = 10; Length? l = nullable.cm(); , we double that count. This means, for our 600+ units today there is now 8 x 600 extension methods. I would be very interested to see the diff in size when removing this, but I think it can be pretty significant since it seems it is mainly our high method/property count that causes the big size and not so much the amount of code inside each of them.

Update 2018-01-20:
I just tried a Release build locally before and after removing all the number extensions code. The difference is a STAGGERING 939 vs 475 kB ! I think we have found our low hanging fruit to reduce code size.

Besides adding this as a separate nuget, we can also consider using the QuantityValue type to implicitly cast from multiple number types per extension method, to avoid the method overload explosion.

@angularsen angularsen changed the title Proposal: Reducing size of library Proposal: Reducing size of library (WIP) Jan 13, 2018
@0xferit
Copy link
Contributor

0xferit commented Jan 15, 2018

Regarding proposal 1: I think it will be too troublesome both for developers and users to have lots of packages. Plus as you said, where does a conversion belong to?

Regarding proposal 2: I liked it, especially the resulting syntax. I don't find it awkward.

Also, I wonder if we could develop an algorithm to make conversions without defining each valid conversion.
For example, program knows that it has Mass, Volume and Density. So in theory, it has enough information to deduce that Mass/Volume yields Density.

@angularsen
Copy link
Owner Author

Interesting that you liked proposal 2, it's not off the table then :-)
As for auto generating operator overloads, see @eriove 's work in progress PR #275.

@angularsen
Copy link
Owner Author

Added proposal 3.

@0xferit
Copy link
Contributor

0xferit commented Jan 19, 2018

#275 looks interesting...

@angularsen
Copy link
Owner Author

I just tried a Release build locally before and after removing all the number extensions code. The difference is a STAGGERING 475 vs 939 kB ! I think we have found our low hanging fruit to reduce code size.

Besides adding this as a separate nuget, we can also consider using the QuantityValue type to implicitly cast from multiple number types per extension method, to avoid the method overload explosion.

@angularsen
Copy link
Owner Author

angularsen commented Feb 3, 2018

Another option to moving number extension methods out is to drastically cut down on them.

I count 10 extension methods per unit to cover int, long, double, float, decimal and their nullable counterparts. This is kind of ridiculous, at least considering the binary size it take up. I argue we can keep int and double and let the rest of them go, since I believe the main usecase here is typing the numbers out yourself like 15.Kilograms() and you basically need integers and numbers with a decimal point so those two types should do. I don't see any good usecase for nullable types here, since this whole mess is just minor syntactic sugar over Mass.FromKilograms(15) anyway.

decimal would be better had it not required you to type 15m.Kilograms(), which I suspect would be foreign to many.

With only int and double we reduce the amount of extension methods down to 20%, which might be small enough to keep around in the main lib.

@tmilnthorp
Copy link
Collaborator

This may be crazy, but removing all of the properties would save a lot of space in the source files. Not sure about the binaries. The From and As methods are sufficient, no? It seems a bit redundant.

@angularsen
Copy link
Owner Author

I'm open to all suggestions, but I do personally like the properties as they take up a lot less characters in the consuming code. There is without doubt a lot of code that is syntactic sugar, or convenience, that we can consider removing without losing functionality. NumberToXXX extension methods for instance are not necessary at all, and could possibly be a separate nuget if we go with a lean core nuget and additional nugets to add things like number extensions.

@tmilnthorp
Copy link
Collaborator

Indeed! To be honest I like the properties (although not thrilled about a property doing a calculation). They do make the code using Units.NET a lot prettier. But, it depends what priority reducing library size is. I personally think multiple packages are harder to find and use. And what's 1MB these days anyways? Or 10?

@angularsen
Copy link
Owner Author

Absolutely, it's all about perspective. Most use cases would shrug at 1MB, but on the other hand if I'm writing a mobile app and care about deploy size runtime performance, then these things suddenly do matter. I do know of people using it in memory and compute constrained environments. Hopefully we can cater to both worlds somehow.

The readme does state this library is about convenience and not high precision nor high performance - so there is that.

Regarding multiple nugets, the idea is to use a metapackage to link in the core and extra packages that form today's content with today's nuget name. So most users would get it all out of the box, but those who care about size could strip off certain parts of it to keep things lean. I've been thinking a lot about keeping some not-so-widely used quantities outside the core, but we found that conversions between quantities would then be pretty much impossible to implement the way it works today. So the alternative is then to move syntactic sugar like number extensions and getter properties out into separate nugets (would have to be extension methods and not gette properties anymore). Not crazy about the idea, but it would work and probably save off quite a bit of binary size.

Then there is the proposal #2 syntax:
Length.From.Kilo.Meters(1)

It could possibly also handle more complex things likePressure.From.Kilo.Newtons.Per.Square.Kilo.Meters. But I'm not crazy about such syntax either, but maybe that's just me. It could drastically reduce the number of methods and properties.

@angularsen
Copy link
Owner Author

I just happened to check the .dll size not long ago and it was 1.3 MB :-| I'm not sure we did anything wrong either, except adding a bunch more units.

Another thing I recently came upon, is that we have duplicated all our From factory methods to handle nullability, like Length? FromMeters(QuantityValue?) . For 800+ units, this adds quite a number of methods I consider a minor convenience, but also a not very common pattern in .NET. I propose we remove them in vNext.

#460 (comment)

@tmilnthorp
Copy link
Collaborator

I wish the following worked with Windows Runtime Component, otherwise I'd say it's a viable option. Perhaps it might not matter given the WRC separation efforts, but then the API would be inconsistent...

Define a global prefixes class with const values:

    public static class Prefixes
    {
        public const double Exa = 1e18;
        public const double Peta = 1e15;
        public const double Tera = 1e12;
        public const double Giga = 1e9;
        public const double Mega = 1e6;
        public const double Kilo = 1e3;
        public const double Hecto = 1e2;
        public const double Deca = 1e1;
        public const double Deci = 1e-1;
        public const double Centi = 1e-2;
        public const double Milli = 1e-3;
        public const double Micro = 1e-6;
        public const double Nano = 1e-9;
        public const double Pico = 1e-12;
        public const double Femto = 1e-15;
        public const double Atto = 1e-18;
    }

Then have a class per unit that defines the conversion functions along with a static property for each one. Let's look at Length (Length2 for ease of writing test code 😄)

    public class Length2
    {
        public double Value { get; }
        public LengthUnit Unit { get; }

        public Length2( double value, LengthUnit unit)
        {
            Value = value;
            Unit = unit;
        }

        public abstract class LengthUnitClass
        {
            public abstract bool IsBase { get; }
            public abstract LengthUnit Unit { get; }

            public Length2 ToBase( Length2 value )
            {
                var inBase = ToBase( value.Value );
                return new Length2( inBase, Length2.BaseUnit );
            }

            public Length2 FromBase( Length2 value )
            {
                if( value.Unit != Length2.BaseUnit )
                    throw new InvalidOperationException( "The given value is not in base units." );

                var inTarget = FromBase( value.Value );
                return new Length2( inTarget, this.Unit );
            }

            protected abstract double ToBase( double value );
            protected abstract double FromBase( double value );

            public static Length2 operator *( double left, LengthUnitClass right )
            {
                return new Length2( left, right.Unit );
            }
        }

        public class MetersUnitClass : LengthUnitClass
        {
            public override bool IsBase => true;
            public override LengthUnit Unit => LengthUnit.Meter;

            protected override double ToBase( double value )
            {
                return value * 1.0; // Generated from json
            }

            protected override double FromBase( double value )
            {
                return value / 1.0; // Generated from json
            }
        }

        public class YardsUnitClass : LengthUnitClass
        {
            public override bool IsBase => false;
            public override LengthUnit Unit => LengthUnit.Yard;

            protected override double ToBase( double value )
            {
                return value * 0.9144; // Generated from json
            }

            protected override double FromBase( double value )
            {
                return value / 0.9144; // Generated from json
            }
        }

        public Length2 ToUnit( LengthUnit targetUnit )
        {
            var unitClass = GetUnitClass( this.Unit );
            var targetUnitClass = GetUnitClass( targetUnit );

            var inBaseUnits = unitClass.ToBase( this );
            return targetUnitClass.FromBase( inBaseUnits );
        }

        public double ToDouble()
        {
            return Value;
        }

        public double ToDouble( LengthUnit targetUnit )
        {
            var inTargetUnits = ToUnit( targetUnit );
            return inTargetUnits.Value;
        }

        private static LengthUnitClass GetUnitClass( LengthUnit unit )
        {
            switch( unit )
            {
                case LengthUnit.Meter: return Length2.Meters;
                case LengthUnit.Yard: return Length2.Yards;
                default:
                    throw new InvalidOperationException( "Invalid unit." );
            }
        }

        public static Length2 operator *( Length2 left, double right )
        {
            return new Length2( left.Value * right, left.Unit );
        }

        public static LengthUnit BaseUnit => LengthUnit.Meter;
        public static MetersUnitClass Meters => new MetersUnitClass();
        public static YardsUnitClass Yards => new YardsUnitClass();
    }

Which would let you do the following:

            var l1 = 3.0 * Length2.Meters;
            var l2 = 3.0 * Prefixes.Kilo * Length2.Meters;
            var l3 = 3.0 * Length2.Meters * Prefixes.Kilo;

            var inYards = l1.ToUnit( LengthUnit.Yard );
            var inYardsDouble = l1.ToDouble( LengthUnit.Yard );

@angularsen
Copy link
Owner Author

Interesting approach! It does require changing to class though.
I assume operator overloads are what blocks this from working in WRC? And yeah, I think having different APIs would be painful both for consumers and maintainers of the lib.

@tmilnthorp
Copy link
Collaborator

I think it may actually be fine for struct.

But yes, operator overloads are the blocking factor 😞

@angularsen
Copy link
Owner Author

V4 branch already addresses a lot of this and reduced size down to about 40%. I'm closing this issue and we will instead revisit this topic with a new, fresh perspective later if we still want to reduce yet more.

One of the topics we've discussed lately is removing the static factory methods (Length.FromMeters(double)) and conversion getter properties (myLength.Meters), since we have 800 units and one of these for each unit it really adds up. This is syntactic sugar that costs significant bytes and we could possibly offer it through extensions in a separate nuget package.

@jostFT
Copy link

jostFT commented Apr 12, 2022

Has this been improved upon? Seeing as the package is still called "lightweight", but the package size is now:
image

Just checking :)

@angularsen
Copy link
Owner Author

angularsen commented Apr 12, 2022

But it WAS lightweight at some point 🙈😅
No, it is still a problem and I would not hold my breath. I don't have the time to pursue this myself, and no one else is championing it at the time.

A few possibilities:

Move from struct to class to support inheritance.

This should allow us to remove some of the duplicated code, but I'm not sure exactly how much we gain/lose on this.

#875 (comment)

Proposal 1 - Core nuget + quantity nugets + meta nugets

This looks more realistic as we have done improvements to replace the use of QuantityType enum and enums like LengthUnit with strings instead, to support external units.

Here be dragons though.

Simpler - Omit entire quantities

Keep Length, Mass etc, but remove the less used ones.

This is easier, since the full unit enums are still valid and available and
Length.FromEsotericUnit(1) is known to exist.

This nuget could provide conversions like Length * Length = Area, but anything beyond that you would have to write yourself.

It seems doable, but I haven't really looked into it. A propotype to verify the idea would be helpful.

Harder - Omit only some units in a quantity

We have a bunch of weird, little-used units, but this seems much harder to tackle while still preserving static type safety. I haven't thought this one through yet, but my gut says it's not going to be fun.


So, if you are motivated to take a stab at this, or maybe see other solutions, then I'm happy to assist!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants