Zippo
alex said: did significantly worse

OK, let's consider whether "significantly worse" is the right description in this case.

If we compare two of your tactics, 4213 Striker Madness V5 and 4213 Striker Madness V6, then V6 scored "62.178" points and V5 scored "64.114" points, so the difference between them is "1.936" points.

V6 was tested for 2,400 matches and in this case the RNG can be as high as "3" points.

V5 was tested for 4,000 matches and in this case the RNG can be as high as "2.5" points.


The above means that the 4,000-match test of V5, where it scored "64.114", only tells us that the real score of the tactic is somewhere between "61.614" and "66.614" points. If we keep retesting V5 for 4,000 matches again and again, then in theory it might hit as low as "59.5" points and as high as "66.5". Of course, "59.5" and "66.5" points for V5 would be extreme values, the probability of seeing them would be no more than 5%, and most of the time we'd be getting scores such as "61", "62", "63".

Please note that in a normal game, when you play a standard season of 38 matches, the RNG can be as high as "20" points, and that's with the same tactic!!! Now think about whether a "1.936" points difference can be called "significantly worse". :)
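
The comparison above can be sketched in a few lines of Python. This is only an illustration: the helper names `score_range` and `ranges_overlap` are made up for this example, and it simply treats each score as "score plus or minus the RNG quoted for that number of matches". If the two ranges overlap, the test can't tell the tactics apart.

```python
def score_range(score, rng_margin):
    """Return (low, high) for a test score, given the RNG margin for that test length."""
    return (score - rng_margin, score + rng_margin)


def ranges_overlap(a, b):
    """True if the two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Figures quoted in the post: V5 after 4,000 matches, V6 after 2,400 matches.
v5 = score_range(64.114, 2.5)
v6 = score_range(62.178, 3.0)

print("V5 range:", v5)
print("V6 range:", v6)
print("Ranges overlap:", ranges_overlap(v5, v6))
```

With the numbers from the post the ranges overlap, so a "1.936" points gap is inside the noise of both tests.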
Hi there,

This time let's find out how the "Dribble More" PI impacts the result.




GENERIC BASE TACTIC:
PIs: "Dribble More", "Tackle Harder"



1) All positions have "Dribble More"
2) "Dribble More" has been removed from the Full Backs
3) "Dribble More" has been removed from all positions

Hey,

I'm sure many of you have noticed that in almost any successful 4231 tactic the Inside Forwards have the "Sit Narrower" PI and the Full Backs have the "Stay Wider" PI. But what if we remove these PIs and see how it changes the result?

"Stay Wider" and "Sit Narrower" PIs



GENERIC BASE TACTIC:
PIs: "Dribble More", "Tackle Harder"



Hey,

Here's another interesting piece of research. Let's find out what the difference is between Full Backs on "Attack" and Full Backs on "Support" in a 4231 formation.


GENERIC BASE TACTIC:
PIs: "Dribble More", "Tackle Harder"


Hey,

No doubt, the "Focus Play Down The Flanks" TIs are among the most "mysterious" Team Instructions in FM.

Unfortunately, it's quite hard to measure the impact of these TIs through our regular tactic testing, because even in a 4,000-match test the RNG can still be as high as 2.5 points, which only allows spotting changes that make at least a 2.5 points difference.

So we decided to shed light on the "Focus Play Down The Flanks" TIs and test them for 32,000 matches. That many matches gives us a very small RNG, no more than 0.5 points.

"Focus Play Down The Flanks" TIs



GENERIC BASE TACTIC:
PIs: "Dribble More", "Tackle Harder"


Here's data for Katana 4231 104p v3.1 tactic with "Overlaps" TIs and without "Overlaps" TIs after 32,000 matches tested.





Judge for yourself.

There's no obvious negative effect from "Overlaps" TIs as it happened for the two previous "generic" tactics.

It seems the "Attacking" mentality or other TIs and PIs might influence how the "Overlaps" TIs perform, and in the "Katana 4231 104p v3.1" tactic the "Overlaps" TIs even give a small positive effect of about "+0.1" points. But don't forget that even a 32,000-match test still has at least 0.2 RNG.

@alex this tactic - https://fm-arena.com/thread/9578-katana-4231-104p-v3-1-27/ might have scored that low because the "Overlaps" TIs weren't the only instructions that were changed. I noticed that the "Moves Into Channels" PI was also removed from the AM, so there's no point in testing it for 32,000 matches, because it isn't possible to "isolate" the absence of the "Overlaps" TIs in that case.
alex said: It dropped quite a bit

CBP87 said: I'm more intrigued to see if the score drops and by how much

By tomorrow evening I'll try to run this for 32,000 matches to make sure it didn't hit any nasty RNG.

It could be that some TIs such as "Focus Play Down The Flanks", the Mentality and some PIs make a huge difference for the "Overlaps" TIs.

Tomorrow we'll find out for sure. ;)
Gianaa9 said: I think that using Hold Up on IFs could improve Overlaps’ rating respect to others one, did you used it?
Thank You!


Hi,

Yes, I agree that some PIs such as "Hold Up Ball" or even "Attacking" Mentality might make "Overlaps" TIs shine in such tactics as Katana or COSMOS.

Probably by tomorrow evening or earlier I'll try to run the Katana tactic without "Overlaps" for 32,000 matches, and we'll find out whether the Mentality or other TIs and PIs make a difference to the performance of the "Overlaps" TIs.
I've updated the 1st post and added data for Inside Forwards(Support) + Full Backs(Attack) combination.
dzek said: @Zippo did you use any PIs in any position?

Hi,

Yes, "Dribble More" and "Tackle Harder" PIs, wherever it was possible to apply them.
Hey,

No doubt, "Overlaps" and "Underlaps" are among the most "mysterious" Team Instructions in FM.

Unfortunately, it's quite hard to measure the impact of these TIs through our regular tactic testing, because even in a 4,000-match test the RNG can still be as high as 2.5 points, which only allows "catching" changes that make at least a 2.5 points difference.

So we decided to shed light on the "Overlaps" and "Underlaps" TIs and test them for 32,000 matches. That many matches gives us a very small RNG, no more than 0.5 points.



Overlaps & Underlaps TIs







Katana 4231 104p v3.1





GENERIC TACTIC:
Inside Forwards(Support) + Full Backs(Attack)
PIs: "Dribble More", "Tackle Harder"






GENERIC TACTIC:
Inside Forwards(Support) + Full Backs(Support)
PIs: "Dribble More", "Tackle Harder"

Guys, I just want to let you know that we've fixed a small bug in the RNG values. Due to the bug they were displaying incorrect values, which were twice as large as they should be.

Now, it's been corrected.



Hey there,

We've tested some of the top tactics for 200,000 matches to measure the RNG as precisely as possible.

For example, the score of Katana 4231 104p v3.1 tactic, which was 64.4 points after 4,000 matches, dropped to 61.9 points after 200,000 matches.

The 200,000-match test helped us measure the RNG over various distances. For example, for a 4,000-match distance the RNG is about 2.4 points, which means that the "Katana 4231 104p v3.1" tactic hit almost the highest RNG possible for that distance. That's typical for almost any tactic at the top. In other words, a tactic that scored 63 or 64 points in our test almost certainly hit the highest RNG for its shape.

The above means that if we keep retesting "Katana 4231 104p v3.1" tactic for 4,000 matches then we will be getting such scores as 59.6, 60, 61, 62, 63, 64, 64.4 points.

As you can see, even after a solid test such as 4,000 matches the score of the "Katana 4231 104p v3.1" tactic can go as low as 59.6 points and as high as 64.4 points. Of course, 59.6 and 64.4 points would be very rare, "extreme" results, rare as unicorns, and you'd have to repeat the test many, many times to get them. On average, testing the "Katana 4231 104p v3.1" tactic for 4,000 matches will give you more "reasonable" numbers such as 61, 62, 63 points.

Also, we estimated that if you want a score within "1" point of "the real score", you have to test a tactic for about 16,000 matches, and if you want a score within "0.5" points of "the real score", you have to test a tactic for about 32,000 matches.
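
To get a feel for why more matches give a tighter score, here is a rough sketch in Python. It is not how the FM-Arena RNG numbers were produced (those come from the 200,000-match test). It assumes every match is independent with fixed win/draw probabilities; the values `p_win=0.45` and `p_draw=0.25` and the function name `score_spread` are made up for illustration. Under those assumptions the spread of a score "translated into 38 matches" shrinks with the square root of the number of matches tested.

```python
import math


def score_spread(n_matches, p_win=0.45, p_draw=0.25, season_length=38):
    """Approximate standard deviation of a score expressed as points per
    `season_length` matches, after testing for `n_matches` matches.

    Assumption (hypothetical): matches are independent, each giving
    3 points with probability p_win and 1 point with probability p_draw.
    """
    mean = 3 * p_win + 1 * p_draw                    # expected points per match
    variance = 9 * p_win + 1 * p_draw - mean ** 2    # variance of points per match
    return season_length * math.sqrt(variance / n_matches)


for n in (38, 4000, 16000, 32000):
    print(f"{n:>6} matches -> spread of about {score_spread(n):.2f} points")
```

The main takeaway from this sketch is the scaling: testing for four times as many matches only halves the spread, which is why going from a "good" test to a very precise one takes so many more matches.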

You might have already noticed that we added "RNG" numbers to the test results. The "RNG" numbers are based on the data from our 200,000-match RNG test.

I hope the above information and the "RNG" numbers will be useful to you.

Cheers.
Khalvito said: Would you elaborate more on what you mean by "buffing" or "nerfing" a team?

It means just increasing/decreasing players' attributes, that's all. :)
Here's another test.

In normal saves, top tactics from FM-Arena allow you to finish a season with 95-100 points when you manage a top club such as Man City, Liverpool, Barcelona, Real Madrid, PSG and others.

We boosted the human controlled teams in FM-Arena tactic testing league so that they were able to reach 100 points (translated into 38 matches) with one of the top tactics.

The following 3 tactics were taken for the test:

4231 Hegemony (through) underlap - https://fm-arena.com/thread/7721-4231-hegemony-through-underlap/

433 BPrinciples - https://fm-arena.com/thread/8791-433-bprinciples/

433 Flat BPrinciples - https://fm-arena.com/thread/8799-433-flat-bprinciples/



The default FM-Arena scores:

63 Points - 4231 Hegemony (through) underlap
56 Points - 433 BPrinciples
49 Points - 433 Flat BPrinciples





Human Controlled Teams are boosted to get 100pts (87% of the possible points):




- the difference between the "4231 Hegemony (through) underlap" tactic and the "433 BPrinciples" tactic shrank from "7" to "2" points.

- the difference between the "4231 Hegemony (through) underlap" tactic and the "433 Flat BPrinciples" tactic shrank from "14" to "5" points.



As you can see, even a tactic that scored 49 pts in FM-Arena testing, such as 433 Flat BPrinciples, would still give you 95 points in a normal save with a top club such as Man City, Liverpool, Barcelona, Real Madrid, PSG and others, and 95 points is usually enough to win the league comfortably.

It's clear that an advantage in player quality negates any weakness in your tactic. The better/worse your players are compared with your opponents' players, the less your tactic matters.

Cheers.
juwow said: Opting for a mid-table team in the Premier League for example could provide a more realistic challenge and better perception to a tactic more optimal quality. Since mid-table teams often face a mix of strong and weak opponents it could present a balanced test for tactics.

Looking forward to hearing your thoughts on this perspective!

Best regards,
Juwow


Hey, pal.

Normal saves are meant for normal play and joy. :)  They are not designed for testing. Obviously, by playing a normal save you can distinguish a very poor tactic from a very good tactic, but not more than that.

As you can see, when you play with a very strong team or a very weak team your tactic plays a small role, and when you play with a mid-table team you get hit by the RNG:

https://fm-arena.com/thread/2713-10-944-matches-tested-fm-rng-measured/

If you check the link above, you'll see that even in such a "perfect" league as the FM-Arena testing league, where all possible RNG factors are eliminated, there's still about +/- 25 points of RNG when it comes to 38 matches.

In other words, when you play a normal save and manage a mid-table team, at the end of the season your result with the same tactic gets a random adjustment between -25 and +25 points. I'm sure there's no need to explain that it's very hard to test anything with such high RNG.
Hey there,

It's really important to understand how to translate the results of FM-Arena tactic testing into your normal save.

In the FM-Arena tactic testing league all teams are equal in quality; there are no strong or weak teams in the league.

But when you play a normal save in a normal league, there are always strong and weak teams.

For example, when you play a normal save and manage one of the top clubs such as Man City, Liverpool, Barcelona or Real Madrid, then with a top tactic from FM-Arena tactic testing you can finish a season with 100 points or more, which means you get about 87% of the possible points.
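
The "87% of the possible points" figure is simple arithmetic, sketched below in Python. The only assumptions are a 38-match season and 3 points per win, so the maximum is 114 points; the function name `share_of_possible_points` is just an illustrative label.

```python
MATCHES_PER_SEASON = 38
POINTS_PER_WIN = 3


def share_of_possible_points(points, matches=MATCHES_PER_SEASON):
    """Fraction of the maximum points (3 per match) that a season total represents."""
    return points / (matches * POINTS_PER_WIN)


# 100 points out of a possible 114
print(f"{share_of_possible_points(100):.1%}")
```

The same helper can be used to turn any FM-Arena score (which is also expressed per 38 matches) into a percentage of the possible points.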

Now let's look at the following two tactics:

4231 Hegemony (through) underlap - https://fm-arena.com/thread/7721-4231-hegemony-through-underlap/

433 BPrinciples - https://fm-arena.com/thread/8791-433-bprinciples/

In FM-Arena Tactic Testing the first tactic, "4231 Hegemony (through) underlap", got 63 points and the second tactic, "433 BPrinciples", got 56 points, so there's a 7 points difference between the tactics.

Please notice that in FM-Arena Tactic Testing those two tactics managed to get only about 55% of the possible points. But what if we "buff" the human controlled teams in the testing league so they can get at least 75% of the possible points, and test our tactics again to see how their results change?


Human Controlled Teams are buffed to get at least 75% of the possible points:




As you can see, we buffed the human controlled teams so they could get at least 75% of the possible points instead of 55%, and the difference between the tactics shrank from 7 points to 3.

Now, imagine what happens to the difference if we further "buff" the human controlled teams so they can get at least 85% of the possible points. It isn't hard to predict that the difference will decrease further, from 3 points to 0.

The above means that when you manage a top club such as Man City, Liverpool, Barcelona, Real Madrid or another, you can use the "4231 Hegemony (through) underlap" tactic or the "433 BPrinciples" tactic and most of the time you won't notice any difference between them, despite the 7 points difference in FM-Arena tactic testing.


WOOOT?



Now, let's nerf the human controlled teams in the testing league so they will be able to get only 27% of the possible points, and test our tactics again.




Human Controlled Teams are nerfed to get only 27% of the possible points:




As you can see the difference between the tactics disappeared. :)

The above means that the better/worse your players are compared with your opponents' players, the less your tactic matters.

We all know that many people prefer testing tactics with the strongest or weakest teams in the league, but you can see that such a choice is the worst one, because in that case the quality of the tactic plays a very small role.

Cheers.
@svonn

This tactic - https://fm-arena.com/thread/8910-wingplay-target-forward-test-tactic/






( TYPE 1 ) Fast, short and weak striker ( 137 CA ) as Target Forward






( TYPE 2 ) Strong, tall and slow striker ( 140 CA ) as Target Forward







RESULT:



As you can see, despite having a higher "star" rating as a TF and a higher CA, the striker with TYPE 2 attributes still produces a worse result.

Btw, there's nothing surprising in that result because as I said, we've tested that many times already, just read through this thread or some others.

If you're getting different results in your own tests, it's because you aren't testing for enough matches or your testing methodology has issues.

Cheers.
Hey, @svonn.

With only 1 striker as TF, I doubt that the difference can beat the RNG even after 2,400 matches.

Edit the tactic and make both strikers TFs, because that's the only way to see a difference that can beat the RNG after 2,400 matches.

Let me know when you're done.

Cheers.
svonn said: Thanks for your reply @Zippo <3 I'll try to keep my points as concise as possible, but if something is unclear, I can elaborate further.

Unfortunately, I can't take the result of your test into consideration, because 500 matches is really nothing and your testing methodology/environment is unknown to me, and even if it were known I still couldn't be sure that your test was done properly.

As I said, it's been tested many times before, but I can test it specially for you one more time.

Please share any tactic that you think would work better with a slow/tall/strong striker instead of a fast/short striker - https://fm-arena.com/board/12-football-manager-2024-tactics-sharing-section/

The tactic will be tested through our regular testing, then I'll test it with the strikers you suggested




and you will be able to see what difference it'll make.