Backtests in Strategy Analyzer give very different results on 3 separate computers, although everything is identical:
NT version: 8.1.6.3 64-bit – same build everywhere
Windows 11, same time zone
Data: db folder copied byte-for-byte + Repair Database on each PC
Scripts: exported/imported via .zip, no changes
Backtest: offline mode, no internet, identical instrument, timeframe, dates, session template, fill type, commissions
Results example: PC1 ~$100,000 net profit, PC2 ~$145,000, PC3 ~$165,000
Differences in trade count, entries/exits, equity curve – not small errors. Tried: cache clear, re-import, multiple DB repairs, file hash check (db matches).
What could be the possible causes of such big discrepancies?
Did you verify that the downloaded historical data on all three machines is exactly identical? For example, it’s easy to accidentally download minute data instead of tick data, or select a slightly different date range. Even small differences in the underlying data can lead to large changes in backtest results. Also make sure that all strategy analyzer properties are identical on each PC, including fill type, order resolution, trading hours template, commissions, and any other settings used when running the test.
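Beyond hashing the db folder, it can help to hash the exported bar data itself from each machine, since that is what the backtest actually consumes. A minimal sketch (the file names are hypothetical examples):

```python
import hashlib


def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


# Export the same instrument/timeframe/date range from each PC,
# then compare digests (paths are illustrative):
# for p in ["pc1_export.csv", "pc2_export.csv", "pc3_export.csv"]:
#     print(p, file_sha256(p))
```

If the digests differ, the machines are not backtesting the same bars, regardless of what the raw db files say.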
That said, I personally would not rely too heavily on NT backtesting results. One issue I've noticed is that indicator values calculated during a backtest can differ significantly from those calculated during market replay. For example, the EMA value produced by NT in Strategy Analyzer may not match the EMA value you see in Replay for the same bar, sometimes by a significant margin. This suggests that the calculation used during backtesting is not the same as the one used during market replay.
Because of this, I prefer not to rely on NT's built-in indicators for strategy logic. Instead, I calculate those values myself inside the strategy. For example, rather than using the built-in EMA, I maintain my own rolling window of n bars and compute the EMA manually so the calculation is fully deterministic. While not an exact match, this produced values much closer to market replay for me.
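As a rough illustration of what a self-contained, deterministic EMA looks like (this is a generic sketch, not the poster's actual code — the seeding choice here is an SMA of the first n closes, and different seeding conventions are exactly the kind of thing that makes one platform's EMA disagree with another's):

```python
class RollingEMA:
    """Deterministic EMA over completed bars, seeded with an SMA of the first n closes."""

    def __init__(self, period: int):
        self.period = period
        self.alpha = 2.0 / (period + 1)   # standard EMA smoothing factor
        self._seed = []                    # closes collected until we can seed
        self.value = None

    def update(self, close):
        """Feed one completed-bar close; returns the EMA or None until seeded."""
        if self.value is None:
            self._seed.append(close)
            if len(self._seed) == self.period:
                self.value = sum(self._seed) / self.period
            return self.value
        self.value += self.alpha * (close - self.value)
        return self.value
```

Because every input and every intermediate value is under your control, the same bar data always produces the same EMA, on any machine.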
My typical backtesting workflow is:
Run the strategy without relying on values derived from NT indicators.
Export the raw bar data and any required fields to a database.
Use Python to simulate bar-by-bar streaming and run the strategy logic there.
Use Market Replay with the same strategy for confirmation.
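The bar-by-bar streaming step in the workflow above can be sketched roughly like this; the CSV column names and the toy signal rule are illustrative assumptions, not the poster's actual setup:

```python
import csv
from dataclasses import dataclass


@dataclass
class Bar:
    time: str
    open: float
    high: float
    low: float
    close: float


def stream_bars(path: str):
    """Replay exported bars one at a time, as a live strategy would see them."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield Bar(row["time"], float(row["open"]), float(row["high"]),
                      float(row["low"]), float(row["close"]))


def run_strategy(bars):
    """Toy strategy loop: act only on each completed bar (placeholder logic)."""
    signals = []
    prev = None
    for bar in bars:
        if prev is not None and bar.close > prev.close:
            signals.append(("long", bar.time))
        prev = bar
    return signals
```

The point of the generator is that the strategy logic only ever sees one completed bar at a time, mimicking live streaming rather than vectorized look-ahead.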
Thank you for the insights @WaleeTheRobot. That is really helpful and explains some of the discrepancies I have seen while backtesting. I am going to try your method this weekend.
In addition, I suggest using time-based series. The reason is that exported data tends to match time-based bars better than something like a tick or range series. For example, if you export the EMA value for each bar, it matches more closely when you run market replay at a faster speed against it. On a non-time-based series like a tick chart, the values can be off a little, and those small errors can compound during replay, making the backtest results unreliable.
Another tip: use values from completed bars only. In real time, the current bar is not yet complete, so in a backtest it should only be used to check for a potential entry, stop, and target. For example, if the bar being checked triggers an entry, enter, but also check the stop and target within that same bar. You can't know which was hit first, so I prioritize the stop if both are hit in the same bar.
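That conservative same-bar rule can be expressed as a small helper; this is a generic sketch for a long position, assuming only OHLC data is available:

```python
def resolve_intrabar(stop: float, target: float,
                     bar_high: float, bar_low: float):
    """Decide the fill for a long position when both stop and target may lie
    inside one completed bar. The intrabar sequence is unknowable from OHLC
    alone, so the stop is conservatively assumed to have been hit first."""
    stop_hit = bar_low <= stop
    target_hit = bar_high >= target
    if stop_hit:           # stop takes priority when both are touched
        return "stopped"
    if target_hit:
        return "target"
    return None            # position still open after this bar
```

Biasing toward the stop makes the backtest pessimistic rather than optimistic, which is usually the safer error to make.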
Thank you for the feedback. I found much of what you wrote to be extremely useful. I would be very grateful if you could share the system architecture for simulating streaming data and executing strategy logic, so that I could build a similar system myself.
Ideally, you want a single source of truth so the features you build and export are sequentially and logically the same. In Python, have an AI recreate exactly the same strategy you are using, and tell it to build a script that reads from the database.
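Reading that single source of truth back in order might look like the following; the SQLite schema (a `bars` table with `instrument`, `ts`, and OHLC columns) is a hypothetical example, not a prescribed layout:

```python
import sqlite3


def load_bars(db_path: str, instrument: str):
    """Stream bars in timestamp order from a SQLite export (hypothetical schema)."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(
            "SELECT ts, open, high, low, close FROM bars "
            "WHERE instrument = ? ORDER BY ts",
            (instrument,),
        )
        for row in cur:
            yield row
    finally:
        con.close()
```

Sorting by timestamp in the query guarantees the strategy consumes bars in the same order on every run and every machine, which is the whole point of having one source of truth.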