
Backtesting performance improvements - some ideas...

Posted By dnickless
Posted Saturday January 04 2014
Hi there almighty gods of RightEdge!

After some pretty heavy and long backtesting/optimization sessions using your lovely application on a 24 CPU machine, I thought it might be worthwhile to invest a few minutes into finding ways to further improve the backtesting experience. Here are a few places in your code that I stumbled across which - from all I can see - might be rewritten for slightly better performance.

1) In
PaperTrader.FillOrder(BrokerOrder order, double price, long fillSize, DateTime tickTime, out Fill fill, out string information)
you access the
AccountValue
property several times, and each access triggers quite a few downstream calculations. I would tend to think it would be enough to read the property once and reuse the result?!
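Purely as an illustration of the idea (the surrounding details of FillOrder are my guess, not the actual code):

// Hypothetical sketch: read the expensive property once and reuse the cached value.
double accountValue = this.AccountValue;   // the downstream calculations run only once

// ...every later check in FillOrder then uses 'accountValue'
// instead of reading this.AccountValue again...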

2) A small enhancement for the admittedly unusual situation when the number of optimizations to run is smaller than the number of available processors: In
DefaultOptimizationPlugin.RunOptimization(SystemRunSettings runSettings)


move the

List runItems = this.CreateRunItems(runSettings);


before the following section

int numThreads = this.ThreadsToUse;
if (numThreads == 0)
{
    numThreads = Environment.ProcessorCount;
}


and add something like this:

numThreads = Math.Min(numThreads, runItems.Count);


3a) In
SystemWrapper.RunSystem(SystemData systemData, SharedSystemRunData runData, ServiceFactory brokerFactory)
there is a hardcoded UI progress bar refresh interval of 0.05s. At least on my 24 CPU machine this value appears to be a little too low (i.e. the progress display gets refreshed too often) and should perhaps be configurable, as it seems to cause quite a bit of locking and UI work. Based on my observations I would expect this change to have a relatively strong impact on performance.
3b) Before the line where you check whether this 0.05s update interval has passed, there is a
using (new Profile("RunSystem.UpdateProgress"))
statement. I am not sure about this one, but I would dare to suggest moving it inside (i.e. after) the
if (DateTime.Now.Subtract(minValue).TotalSeconds > 0.05)
check, so that the profiling scope is only entered when a progress update actually happens - something like the sketch below.
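In other words, something along these lines (the variable names are taken from the decompiled snippet and may not match the real code exactly):

// Sketch only: test the (ideally configurable) interval first and enter the
// profiling scope only when a progress update is actually going to happen.
// 'updateIntervalSeconds' stands for the currently hardcoded 0.05 or a configurable value.
if (DateTime.Now.Subtract(minValue).TotalSeconds > updateIntervalSeconds)
{
    using (new Profile("RunSystem.UpdateProgress"))
    {
        // ...update the progress display and remember the time of this update...
    }
}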

4) In
FrequencyManager.ProcessBarsDirectly(NewBarEventArgs args)
caching the
args[info.FreqKey.Symbol]
lookup might help a ridiculously small amount, too, but that one surely won't get us very far. Wink
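Just to illustrate the idea (the surrounding loop shape is my guess, not the actual implementation):

// Hypothetical sketch: look the bar data up once per iteration and reuse it,
// instead of indexing args[info.FreqKey.Symbol] repeatedly.
var symbolBars = args[info.FreqKey.Symbol];
// ...all subsequent uses within the iteration then read 'symbolBars'
// rather than going through the indexer again...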

5) In
PlugInManager.InitializeAppDomainStub.InitializeAppDomain(string pluginDirectory, string pluginCacheFilename, string userAppDataPath)
there is a call
PluginManager.LogLoadedAssemblies();
which again will trigger the static constructor of the
PlugInManager
type for every app domain that you create. This causes Log4Net to be loaded several times, which seems to take quite a while and might not necessarily be needed. Do you really need to log the loaded assemblies here, or is it because the logger will be needed in the created app domain at a later stage anyway?

I'm out of ideas for now but will keep my eyes open. Any feedback on the above topics is highly appreciated.

Best regards


Daniel
Posted Sunday February 02 2014
*bump*
Posted Monday February 03 2014
dnickless (2/2/2014)
*bump*


Hi, thanks a lot for the in-depth suggestions. I've actually made the first two suggested changes last week. I'm now thinking about what to do for the third one.

Thanks!
Daniel
Posted Wednesday February 19 2014
For 3a, I've added a CommonGlobals.SystemRunUpdateRate property where you can control this. You'll need to set it in your system (as opposed to an optimization plugin) because otherwise it won't be in the right app domain.
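For example (assuming the property simply takes the interval in seconds - double-check the actual type in the new build):

// Hypothetical usage sketch; the real type/units of SystemRunUpdateRate may differ.
// Set it early in the trading system so it takes effect in the system's app domain:
CommonGlobals.SystemRunUpdateRate = 1.0;   // e.g. refresh progress at most once per second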

I'm going to call this done for now. When I get a new build out with these changes, give it a try, let me know how it works, and if number 5 is still something you think I should look at.

Thanks,
Daniel
Posted Sunday February 23 2014
Hi again

Thanks for the new version. 3a seems to make a nice difference. But there's more... Wink We're getting to way smaller things now, too, but I still think they might be worth a look since some of these buggers get called several million times in nested loops per simulation run.

6) This one appears to have a relatively strong effect... In
Common.Dequeue.SetSize(int)
there's a for loop that copies all the array items one by one, which could be replaced by something much faster like
T[] objArray = new T[newSize];
Array.Copy(this.InnerList, this.Head, objArray, 0, this.Count);

I can imagine there might be other areas in your code base where this could be applied, too.
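For completeness, a rough sketch of what SetSize could look like with Array.Copy - the field names (InnerList, Head, Count) are taken from the snippet above and may not match the real implementation, and the wrap-around handling is my own addition:

private void SetSize(int newSize)
{
    T[] newArray = new T[newSize];
    if (this.Head + this.Count <= this.InnerList.Length)
    {
        // Contents are contiguous: one block copy replaces the per-item loop.
        Array.Copy(this.InnerList, this.Head, newArray, 0, this.Count);
    }
    else
    {
        // Contents wrap around the end of the buffer: copy the two segments.
        int firstPart = this.InnerList.Length - this.Head;
        Array.Copy(this.InnerList, this.Head, newArray, 0, firstPart);
        Array.Copy(this.InnerList, 0, newArray, firstPart, this.Count - firstPart);
    }
    this.InnerList = newArray;
    this.Head = 0;   // the copied elements now start at index 0
}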

7) In
FrequencyManager.UpdateTime(DateTime)
there's an if-else block where one branch seems to be completely useless - you set a flag (or something similar) based on the order of two dates, but the flag is never used afterwards.

8a) Here's a slightly faster (fewer calls to RoundTime) version of
TimeFrequency.NextRoundedTime(DateTime, TimeSpan)
:

public static DateTime NextRoundedTime(DateTime date, TimeSpan period)
{
    var date1 = RoundTime(date, period);

    if (date1 <= date)
    {
        var ret = date1;
        var span = period.TotalDays > 1.0 ? TimeSpan.FromDays(1.0) : period;

        do
        {
            date1 = date1.Add(span);
            ret = RoundTime(date1, period);
        } while (ret <= date);

        return ret;
    }

    return date1;
}



8b) In
TimeFrequency.RoundTime(DateTime, TimeSpan)
you may or may not - a question of style vs performance - want to rewrite the while loop in
time = new DateTime(date.Year, date.Month, date.Day, 0, 0, 0);
while (time.DayOfWeek != DayOfWeek.Monday)
{
    time = time.AddDays(-1.0);
}

as
time = time.AddDays(-((int)date.DayOfWeek + 6) % 7);


8c) I was unable to fully understand why your date logic in this area is so complex. It may well be required, but perhaps something like this http://feliperochamachado.com.br/blog/2011/07/rounding-datetime-and-timespan-values-in-c-net/ combined with a little extra logic to make sure that a one-week frequency always lands on Mondays would suffice? Oh well, all this date stuff does not seem to hurt performance too badly anyway...
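For reference, the core of the linked approach boils down to plain tick arithmetic - this is only a sketch of that idea, not RightEdge's actual RoundTime:

// Round a DateTime down to the nearest multiple of a TimeSpan using ticks.
public static DateTime RoundDown(DateTime time, TimeSpan period)
{
    long ticks = time.Ticks - (time.Ticks % period.Ticks);
    return new DateTime(ticks, time.Kind);
}

// Incidentally, .NET's tick zero (January 1, 0001) falls on a Monday, so rounding
// down with a 7-day period this way already lands on Mondays.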

9)
IndicatorManager.NewBar
- avoid duplicate casting checks like this:
var indicator = class2.Indicator as IIndicator;
if (indicator != null)
{
    (class2.Indicator as IIndicator).AppendBar(args2.Bar);
}
else
{
    var seriesCalculator = class2.Indicator as ISeriesCalculator;
    if (seriesCalculator != null)
    {
        (class2.Indicator as ISeriesCalculator).NewBar();
    }
}
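A minimal sketch of the tightened version, simply reusing the locals that already hold the cast results (names as in the snippet above):

var indicator = class2.Indicator as IIndicator;
if (indicator != null)
{
    // Reuse the result of the 'as' cast instead of casting a second time.
    indicator.AppendBar(args2.Bar);
}
else
{
    var seriesCalculator = class2.Indicator as ISeriesCalculator;
    if (seriesCalculator != null)
    {
        seriesCalculator.NewBar();
    }
}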



Cheers


Daniel
Posted Sunday February 23 2014
How are you determining that these are bottlenecks? Are you running a profiler when RightEdge is running your system?

I ask because I wouldn't expect Dequeue.SetSize() to have any noticeable effect on performance. It should only be called when the queue needs to grow, which should be at most O(log(n)) times, where n is the number of bars.

Of course, I could be missing something...

Thanks,
Daniel
Posted Monday February 24 2014
Yes, I am running a profiler (RedGate ANTS). And quite frankly, that darn thing might be pretty off sometimes, I have to admit - I seem to get slightly different results depending on what angle I look at them from. Still, my test case here is a trading system that runs on 100 symbols simultaneously, processing both 5-minute and hourly bars over about one year (or was it half a year, I can't remember) of data. It has a handful of indicators built in, which might make a difference, too. But if the profiler is right, the SetSize method gets called 21,200 times in this scenario... Anyhoo, up to you, of course. It seems like a safe enhancement which will certainly not slow things down.

There's a website (http://waldev.blogspot.ch/2008/05/efficiently-copying-items-from-one.html) where some fellow posted a nice little test with a basic time measurement built in. I copied it and amended it to match our scenario here, and the difference in performance was indeed striking... So, Array.Copy() it is! Wink

Oh well, I'll keep digging. Cannot be bothered to wait for my backtests (let alone the optimization runs where every millisecond counts in the end). And I'm sure there's more to optimize even though you guys are doing a really amazing job already.
Posted Monday February 24 2014
10) Now we're getting a bit more hardcore... Again, please keep in mind that this is based on the profiler's results, and Mr. Profiler says that in my simple test run there were over 310 million calls to the overloaded == operator of the Symbol class, adding up to 16 seconds of pure CPU time. So here are my five cents on that particular bit.

In the Symbol class, you have (more or less) the following methods:

public override bool Equals(object obj)
{
    Symbol other = obj as Symbol;
    if (other != (Symbol) null)
        return this.Equals(other);
    else
        return false;
}

public bool Equals(Symbol other)
{
    if (other == (Symbol) null)
        return false;
    if (object.ReferenceEquals((object) this, (object) other))
        return true;
    if (this.GetHashCode() != other.GetHashCode()
        || !other.name.Equals(this.name)
        || !other.exchange.Equals(this.exchange)
        || !other.currencyType.Equals((object) this.currencyType)
        || !other.assetType.Equals((object) this.assetType)
        || !other.contract.Equals((object) this.contract)
        || !other.expirationDate.Equals(this.expirationDate))
        return false;
    else
        return other.strikePrice.Equals(this.strikePrice);
}

public static bool operator ==(Symbol s1, Symbol s2)
{
    if (object.ReferenceEquals((object) s1, (object) null))
        return object.ReferenceEquals((object) s2, (object) null);
    else
        return s1.Equals(s2);
}


In both Equals methods, the first if-statement results in an unnecessary and costly call to your overloaded == operator, which - due to its very own implementation - inevitably results in another call to Equals(Symbol), which in turn calls the overloaded equality operator one more time, before eventually returning false for the non-null object that was passed to the Equals method in the first place.

Let me rephrase that, since I'm starting to get confused myself - and also feel free to take a look at the attached screenshots as a reference...

Imagine you have a bog standard
Dictionary<Symbol, SymbolInfo> symbolInfoCache
like you do quite a bit in your code and access it like
symbolInfoCache[someSymbol]
. This will cause the ObjectEqualityComparer that gets used by default by the
Dictionary<Symbol, SymbolInfo>.FindEntry(TKey key)
method to call the
Symbol.Equals(object)
method, passing a (most likely non-null) reference to
someSymbol


And now, the whole chain starts ("a" being any of the symbols in the dictionary and "someSymbol" being a non-null reference to the symbol that you pass to the dictionary's indexer):

symbolInfoCache[someSymbol] -> Dictionary.FindEntry(someSymbol) -> *now, there's even a loop over some parts of the dictionary* -> ObjectEqualityComparer.Equals(a, someSymbol)

-> a.Equals(object someSymbol)
..-> !=(someSymbol, null)
....-> !(== operator(someSymbol, null))
......-> someSymbol.Equals(Symbol null)
........-> ==(null, null)
......<- true
....<- false
..<- true
..-> a.Equals(Symbol someSymbol)
....-> ==(someSymbol, null)
......-> someSymbol.Equals(Symbol null)
........-> ==(null, null)
......<- true
....<- false
..<- now, a and someSymbol actually get compared and a proper result gets returned
<- proper result.

So to cut a long story short, I would suggest you replace those == and != checks with a version that's based either on "System.Object.ReferenceEquals(other, null)" or, alternatively, on a cast like "(object)other == null".
Also, there are other parts in your code, like FrequencyManager.FreqKey.Equals and FrequencyManager.FreqKey.GetHashCode, where you've got the same issue in a slightly milder form. It might be worth rewriting the == operator along the lines of http://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx
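To make that concrete, here is a rough sketch of how the null checks could be written so that they never re-enter the overloaded operators - illustration only; the actual field-by-field comparison would stay as it is:

public static bool operator ==(Symbol s1, Symbol s2)
{
    if (object.ReferenceEquals(s1, s2))
        return true;                          // same instance, or both null
    if ((object)s1 == null || (object)s2 == null)
        return false;                         // exactly one side is null
    return s1.Equals(s2);
}

public static bool operator !=(Symbol s1, Symbol s2)
{
    return !(s1 == s2);
}

public override bool Equals(object obj)
{
    // 'as' yields null for non-Symbol inputs, so no separate type check is needed.
    return this.Equals(obj as Symbol);
}

public bool Equals(Symbol other)
{
    if ((object)other == null)                // plain reference check, no operator call
        return false;
    if (object.ReferenceEquals(this, other))
        return true;
    // ...the existing comparison of hash code, name, exchange, currency,
    // asset type, contract and expiration stays exactly as it is...
    return other.strikePrice.Equals(this.strikePrice);
}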


Dear, dear, dear... Gotta go to bed now. More soon.
Posted Monday February 24 2014
Forgot the attachments...

Attachments
equality_operator1.jpg
equality_operator2.jpg
equality_operator3.png
Posted Monday February 24 2014
Are you using sampling or instrumentation for your profiling? Instrumentation records every method call, which means there's a slight overhead for each call, and when there are lots of calls to very small methods this can skew the results. Sampling has much lower overhead and will probably give a more accurate view of where the performance bottlenecks are.

Thanks,
Daniel

